Mitchell-Riley syndrome (MRS) is caused by recessive mutations in the regulatory factor X6 gene (RFX6) and is characterised by pancreatic hypoplasia and neonatal diabetes. To determine why individuals with MRS specifically lack pancreatic endocrine cells, we micro-CT imaged a 12-week-old foetus homozygous for the nonsense mutation RFX6 c.1129C>T, which revealed loss of the pancreas body and tail. From this foetus, we derived iPSCs and show that differentiation of these cells in vitro proceeds normally until generation of pancreatic endoderm, which is significantly reduced. We additionally generated an RFX6HA reporter allele by gene targeting in wild-type H9 cells to precisely define RFX6 expression and in parallel performed in situ hybridisation for RFX6 in the dorsal pancreatic bud of a Carnegie stage 14 human embryo. Both in vitro and in vivo, we find that RFX6 specifically labels a subset of PDX1-expressing pancreatic endoderm. In summary, RFX6 is essential for efficient differentiation of pancreatic endoderm, and its absence in individuals with MRS specifically impairs formation of endocrine cells of the pancreas body and tail.
Pancreas development begins with the emergence of the dorsal and ventral pancreatic buds from the posterior foregut (PFG) (Bastidas-Ponce et al., 2017). Progenitor cells in these buds proliferate extensively before giving rise to the endocrine, duct and acinar lineages. The exocrine pancreas is composed of acinar cells that produce digestive enzymes and duct cells that carry them to the digestive tract. Pancreatic endocrine cells reside in the islets of Langerhans and each secretes a different hormone involved in regulating blood glycaemia: β-cells (insulin), α-cells (glucagon), δ-cells (somatostatin), ε-cells (ghrelin) and PP cells (pancreatic polypeptide). Failure of the β-cells to secrete sufficient insulin results in persistent hyperglycaemia and diabetes. Type 1 and 2 diabetes account for the vast majority of cases and are the result of a complex interplay between numerous genetic and environmental factors. In contrast, maturity onset diabetes of the young (MODY) and neonatal diabetes are extremely rare (1-2% of individuals with diabetes) and are usually caused by mutations in single genes, typically encoding factors important for the development or function of β-cells (reviewed by Schwitzgebel, 2014). As well as aiding diagnosis and prognosis, studies of causative genes can shed light on the roles of these factors during normal development, as well as in more common forms of diabetes.
Mitchell-Riley syndrome (MRS) (https://www.omim.org/entry/615710) is caused by autosomal recessive mutations in the winged-helix transcription factor regulatory factor X6 gene (RFX6) (Smith et al., 2010). MRS is characterised by intrauterine growth retardation, atresia of the duodenum, atresia or hypoplasia of the gall bladder, annular or hypoplastic pancreas, and permanent neonatal diabetes (Mitchell et al., 2004; Smith et al., 2010; Spiegel et al., 2011). In severe cases, these symptoms may be accompanied by chronic diarrhoea, biliary defects, including cholestasis, and varying degrees of pancreatic exocrine deficiency, with death typically occurring within the first few years of life. The spectrum of congenital abnormalities observed in MRS suggests that RFX6 is important either for patterning of the PFG, from which the affected tissues are derived, or for the subsequent development and specialisation of these tissues. In the case of the pancreas, individuals with MRS do develop this organ but it contains varying quantities of functional exocrine tissue and very few chromogranin A-positive (CHGA+) endocrine cells (Mitchell et al., 2004). Together, these observations suggest that RFX6 is required during development of the pancreas itself, and not during the initial patterning of the PFG.
Genetic studies in the mouse point to a related but distinct role for the Rfx6 gene in pancreas development. Like human patients, Rfx6-null mice show obstruction of the small intestine, variable degrees of pancreatic hypoplasia and premature death (Smith et al., 2010). Significantly, however, germline deletion of Rfx6 or conditional inactivation of Rfx6 in neurogenin 3-positive (Ngn3+) endocrine precursor cells does not affect the generation of Chga+ endocrine cells, although they do not contain insulin, glucagon or somatostatin (Smith et al., 2010; Piccand et al., 2014). Thus, Rfx6 is not required for the formation of endocrine cells in the mouse, but only for their correct differentiation and consequently downstream function in maintaining homeostasis. This discrepancy highlights the need to specifically develop models to study RFX6 function in the human pancreas.
To achieve this goal, we derived induced pluripotent stem cells (iPSCs) from two individuals with MRS and their unaffected father. Micro-CT imaging of one of these individuals confirmed the spectrum of congenital defects typical of MRS and exome sequencing identified a homozygous nonsense mutation in RFX6. MRS patient-derived iPSCs form a normal PFG, but exhibit reduced efficiency of differentiation into pancreatic endoderm. Aberrant differentiation is accompanied by expression of genes associated with mesoderm differentiation, indicating that RFX6 is crucial for maintaining the transcriptional program that specifies pancreatic endoderm. Finally, we demonstrate for the first time that RFX6 is expressed in the nascent pancreatic bud of Carnegie stage 14 (CS14) human embryos.
A consanguineous family with MRS patients
We identified a consanguineous Syrian family in which three newborn children exhibited symptoms characteristic of MRS (Fig. 1A). The eldest two female siblings (III:1 and III:2) presented with neonatal diabetes, anaemia and intrauterine growth restriction, weighing 1.15 and 1.3 kg at birth (below the 2nd centile), respectively, and died within a few days of birth. Their stomachs, but not bowel, were distended by gas, consistent with congenital duodenal atresia. The third daughter (III:3) was born with the same symptoms as the eldest two and also died a few days after birth. In this case, further analysis failed to detect abnormalities of the brain, skull, face, heart, lungs or spine, and revealed a normal karyotype with no evidence of structural abnormalities. Exome sequencing revealed a homozygous, nonsense mutation in RFX6 (RFX6 c.1129C>T) (Fig. 1B and described in detail below). During the fourth pregnancy (III:4), chorionic villus sampling determined that the foetus was homozygous for the same RFX6 mutation as patient III:3. Based on this information, the pregnancy was electively terminated and the foetus preserved at the 12th week. The fifth and final pregnancy resulted in the birth of a healthy male child (III:5), who was found to be heterozygous for the RFX6 c.1129C>T mutation. Skin fibroblasts were obtained from patients III:3 and III:4, and their healthy, heterozygous father (II:1), and reprogrammed into iPSC. All five iPSC lines were found to be karyotypically normal (Fig. S1A and data not shown).
To better understand at which stage pancreas development fails in individuals with MRS, we carefully chronicled at high spatial resolution the development of the pancreas in healthy human embryos. The pancreas is a composite organ derived from two primordia that arise from the ventral and dorsal sides of the distal foregut, commonly referred to as the ventral and dorsal pancreatic buds. These buds appear histologically as early as CS13, corresponding to 28-32 days post-conception (Fig. S2). Thereafter, the ventral pancreatic bud undergoes complex morphogenesis, moving towards the dorsal bud, with the two fusing at CS17/18 (42-48 days post-conception) to form the final template for the adult pancreas. The pancreatic head is formed from the ventral bud, and the body and tail mainly from the dorsal bud. Although similar morphological changes have been shown in mouse, this is the first time movement of the buds and the exact timing of these events have been pinpointed in human embryos in an interactive three-dimensional fashion (see 3D interactive PDF in the supplementary information).
To visualise congenital defects due to loss of RFX6, we micro-CT scanned the 12-week foetus III:4 and a healthy control of similar age (Fig. 1C). The RFX6−/− foetus lacks the pancreas body and tail, with the maximal pancreas length roughly half that of the comparably staged wild-type foetus (3.81 mm RFX6−/− versus 7.29 mm wild type), lacks the gall bladder and additionally displays duodenal atresia (Fig. S1B). All detected abnormalities are in line with the spectrum of congenital defects typical of MRS (Smith et al., 2010; Spiegel et al., 2011 and https://www.omim.org/entry/615710). No abnormalities were observed in other abdominal organs (liver, spleen, kidneys, etc.) (Movie 1). We therefore propose that a defect specific to the dorsal pancreatic bud underlies the absence of the pancreas body and tail.
RFX6 c.1129C>T encodes a truncated protein p.Arg377Ter
In order to model human pancreatic development in individuals with MRS, we used a well-established, robust 12-day directed protocol adapted from Rezania et al. (2014) (schematised in Fig. 2A) to differentiate the control H9 human embryonic stem cell (hESC) line and iPSCs derived from patient III:4 and her carrier father into pancreatic progenitors. We have previously demonstrated that this platform yields near-homogenous definitive endoderm by day 4 (SOX17+;CXCR4+), followed by PGT (HNF6+;FOXA2+) on day 7, pancreatic endoderm (PDX1+;SOX9+) on day 10 and pancreatic progenitors (PDX1+;NKX6-1+) on day 12, even when applied to genetically diverse human pluripotent stem cells (hPSC) (Trott et al., 2017). To investigate the consequences of the RFX6 c.1129C>T mutation at the mRNA level, we first documented RFX6 expression over the course of 12 days of in vitro differentiation. Quantitative RT-PCR reveals that RFX6 expression peaks on day 8 of differentiation in wild-type H9 cells (Fig. 2B) – a result consistent with the finding that RFX6 is expressed in the dorsal pancreatic bud at CS12-14, coinciding with the emergence of pancreatic endoderm (Jennings et al., 2017). In contrast, iPSCs from both patient III:4 and her father express RFX6, but at levels significantly lower than in H9 cells on day 8 (Fig. 2B).
The RFX6 protein consists of four functional domains: a DNA-binding domain plus three domains that mediate dimerisation (Fig. 2C) (Aftab et al., 2008). The RFX6 c.1129C>T mutation is predicted to be a nonsense mutation that produces a truncated protein (p.Arg377Ter) lacking the core C-terminal dimerisation domain. To determine whether the RFX6 c.1129C>T allele generates a partial protein product, we measured the levels of wild-type and p.Arg377Ter RFX6 by western blotting (Fig. 2D). We failed to detect any RFX6 protein species in MRS2-6 on day 10 of differentiation, which provides strong evidence that c.1129C>T is a bona fide RFX6 loss-of-function allele. Hereafter, individuals homozygous for the c.1129C>T mutation are referred to as RFX6−/− and carriers as RFX6+/−.
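The relationship between the cDNA coordinate c.1129 and the protein coordinate p.Arg377 can be checked with simple codon arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the identity of the affected codon is inferred from the standard genetic code rather than taken from the RFX6 reference sequence.

```python
# Standard genetic code, DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def codon_index(cdna_pos):
    """Map a 1-based coding-sequence position to
    (codon number, position within codon), both 1-based."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def apply_substitution(codon, pos_in_codon, alt):
    """Return the codon with the base at pos_in_codon (1-based) replaced."""
    return codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]


codon_number, pos_in_codon = codon_index(1129)

# Which arginine codons carry a C at this position and become a stop
# codon after a C>T substitution there?
candidates = sorted(
    c for c in ARG_CODONS
    if c[pos_in_codon - 1] == "C"
    and apply_substitution(c, pos_in_codon, "T") in STOP_CODONS
)

print(codon_number, pos_in_codon, candidates)
```

Position 1129 is the first base of codon 377, and the only arginine codon that a C>T change at that position converts into a stop codon is CGA (giving TGA), which is consistent with the p.Arg377Ter annotation.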
RFX6 is required for efficient differentiation of pancreatic endoderm
To determine whether the MRS phenotype is reproduced in vitro, we compared gene expression between cells derived from healthy subjects and individuals with MRS. As anticipated from the clinical presentation of MRS, patient-derived iPSCs were able to generate definitive endoderm and PGT normally (Fig. 3A,B). However, beginning on day 8, loss of RFX6 resulted in significantly reduced expression of PDX1 and SOX9, the co-expression of which identifies the multipotent, proliferative progenitors of all three pancreatic lineages (reviewed in Kawaguchi, 2013). Indeed, colocalisation of PDX1 and SOX9 is near ubiquitous in H9 cells on day 10 of differentiation, but in RFX6−/− iPSCs, large areas of PDX1−;SOX9− cells are observed, suggesting an overt loss of pancreatic identity (Fig. 3C).
Co-expression of the transcription factors PDX1 and NKX6-1 defines a population of multipotent pancreatic progenitors poised to differentiate into endocrine cells (Russ et al., 2015). We therefore sought to quantify the effects of RFX6 loss on these pancreatic progenitors. The differentiation protocol that we employ typically generates >80% PDX1+;NKX6-1+ pancreatic progenitors (Figs. 3D,E and Trott et al., 2017). Loss of a single copy of RFX6 substantially reduces the proportion of NKX6-1+ cells, leaving large numbers of singly PDX1-positive cells, whereas loss of both copies further reduces the proportion of both PDX1+ and NKX6-1+ cells (Fig. 3D,E). This suggests that RFX6 is not absolutely required to generate PDX1+ pancreatic endoderm or endocrine-competent PDX1+;NKX6-1+ progenitor cells, but that it controls the efficiency of differentiation in a dose-sensitive manner. These observations are consistent with the MRS phenotype in which loss of RFX6 results in the absence of CHGA+ endocrine cells and pancreatic hypoplasia, but exocrine tissue still forms.
RFX6 maintains the transcriptional program that defines pancreatic endoderm
To elaborate the requirement for RFX6 to form pancreatic endoderm, we compared the transcriptomes of cells from patient III:4 and her heterozygous father on day 8 of differentiation, which coincides with peak RFX6 expression and the onset of changes in gene expression (Figs 2B and 3A). Principal component analysis revealed that changes in gene expression were largely due to differences between the affected and unaffected individuals, rather than between iPSC lines or biological replicates (Fig. 4A). Lack of RFX6 leads to downregulation of 88 genes and upregulation of 309 genes (Fig. 4B; Tables S1 and S2). Analysis of gene ontology terms associated with genes whose expression depends on RFX6 identified a single enriched term, ‘Pancreas development’ (Fig. 4C). Conversely, genes upregulated in the absence of RFX6 were associated with development of mesoderm derivatives: specifically, heart, skeleton and muscle (Fig. 4C). Interestingly, these genes were often involved in morphogenesis or formation of the extracellular matrix. These observations suggest that RFX6 has dual functions: to initiate expression of genes required for pancreatic development and to maintain pancreatic identity by blocking activation of genes that direct formation of other lineages.
To characterise the effects of RFX6 function on the pancreatic transcriptional program, we analysed the expression of genes with known roles in mammalian pancreas development (Fig. 4D). Several genes known to be expressed in pancreatic endoderm were strongly downregulated in MRS cells (e.g. PDX1, SOX9, PROX1, ONECUT1 and ONECUT2), alongside the endocrine markers NEUROG3 and INSM1. Conversely, genes with roles in both definitive and pancreatic endoderm development (GATA4/6, FOXA2 and HNF1A, etc.) tended to be unaffected by loss of RFX6. Furthermore, although SOX17 expression is normally downregulated following PGT differentiation, SOX17 levels remained elevated at day 8 in MRS cells, suggesting a delay in further differentiation in the absence of RFX6. We next examined the expression of genes that mark the CS12-14 dorsal pancreatic bud, but not the nearby hepatic cords (Fig. 4E) (Jennings et al., 2017). Of these genes, 15 are significantly downregulated in MRS patient cells: PDX1, SOX9, NETO1, ONECUT1, NPY1R, FAM84A (LRATD1), PRDM16, RAP1GAP2, VWA5B2, RP11-351J23.1, GOLGA6L9, PAQR5, NRG3, MST1L and TLE2. When taken together, these observations suggest that RFX6 is specifically required for PFG cells to differentiate into pancreatic endoderm and identify several candidates that influence pancreatic endoderm differentiation downstream of RFX6.
Using CRISPR/Cas9 gene editing, we introduced a triplicated HA epitope into the 3′ end of the RFX6 gene in H9 cells (designated RFX6HA/HA) (Fig. 5A, Fig. S3A-D). Pluripotent RFX6HA/HA cells were then differentiated into day 8 pancreatic progenitors, and endogenous RFX6-HA detected by immunofluorescence (Fig. S3E). Importantly, we observed no effect of the HA tag on the production of pancreatic progenitors, indicating that the addition of the HA tag does not interfere with RFX6 function (Fig. 5B). To eliminate background signal caused by non-specific antibody binding, a control experiment using wild-type H9 hESC was performed in parallel.
To identify direct transcriptional targets of RFX6, we performed chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-seq). We identified 1350 regions bound by RFX6 that were not enriched in control experiments using wild-type H9 cells (Fig. 5C; Tables S3 and S4). These sites were typically located more than 1 kb away from the transcription start sites of the nearest gene, in introns or intergenic regions (Fig. 5D,E), suggesting RFX6 does not usually act by binding to gene promoters. Transcription factor target motif analysis identified an RFX6 target binding sequence that is similar to, yet distinct from, sequences bound by RFX1 and RFX3 (Fig. 5F). In order to determine which differentially expressed genes are direct targets of RFX6, we identified those that are closest to an RFX6-binding site. Of 53 putative direct targets of RFX6, in the absence of RFX6, 10 genes are downregulated (HHEX, CDHR3, ATOH1, RBFOX1, CPM, CASC18, BCHE, RALYL, RAB3C and KCNJ3), while 43 are upregulated (Table S5). Interestingly, RFX6 is present at distal 3′ enhancers of HHEX and ISL1 (Fig. S4), with the murine Hhex and Isl1 homologs well known for their crucial roles in early pancreatic morphogenesis in mice (Ahlgren et al., 1997; Bort et al., 2004).
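The assignment of binding sites to differentially expressed genes can be illustrated with a nearest-gene lookup. The following is a minimal sketch and not the pipeline used in this study: it assumes that each peak is assigned to the gene whose transcription start site (TSS) lies closest to the peak midpoint, and all gene names, coordinates and function names are hypothetical placeholders.

```python
from bisect import bisect_left


def nearest_gene(peak_chrom, peak_mid, tss_by_chrom):
    """Return (gene, distance) for the TSS closest to a peak midpoint.

    tss_by_chrom maps chromosome -> list of (position, gene) tuples
    sorted by position.
    """
    tss_list = tss_by_chrom.get(peak_chrom, [])
    if not tss_list:
        return None, None
    positions = [pos for pos, _ in tss_list]
    i = bisect_left(positions, peak_mid)
    # Only the TSSs immediately flanking the midpoint can be the nearest.
    neighbours = tss_list[max(i - 1, 0): i + 1]
    pos, gene = min(neighbours, key=lambda t: abs(t[0] - peak_mid))
    return gene, abs(pos - peak_mid)


# Hypothetical toy annotation and peak set (not real coordinates).
tss_by_chrom = {
    "chr1": [(1_000, "GENE_A"), (50_000, "GENE_B")],
    "chr2": [(20_000, "GENE_C")],
}
peaks = [("chr1", 4_000, 4_400), ("chr1", 48_000, 48_200), ("chr2", 35_000, 35_300)]

# Hypothetical differential-expression calls from the RNA-seq comparison.
downregulated = {"GENE_A"}
upregulated = {"GENE_C"}

putative_targets = {}
for chrom, start, end in peaks:
    gene, dist = nearest_gene(chrom, (start + end) // 2, tss_by_chrom)
    if gene in downregulated:
        putative_targets[gene] = ("down", dist)
    elif gene in upregulated:
        putative_targets[gene] = ("up", dist)

print(putative_targets)
```

In this toy example, the peak nearest GENE_B is discarded because GENE_B is not differentially expressed, mirroring the way binding and expression data are combined to nominate putative direct targets.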
Pancreatic endoderm cells lacking RFX6 arise during normal development
iPSCs from individuals with MRS are capable of forming both pancreatic endoderm and PDX1+;NKX6-1+ pancreatic progenitors, although loss of RFX6 substantially reduces the efficiency of differentiation (Fig. 3B-E). This suggests that RFX6 is not absolutely required for pancreatic endoderm differentiation, but affects the efficiency of the process. However, it is unclear whether pancreatic endoderm cells lacking RFX6 expression are observed during normal development, or whether all PDX1+ pancreatic endoderm cells express RFX6. Given the lack of reliable antibodies that recognise human RFX6, we performed immunofluorescence staining using the RFX6HA/HA H9 hESC line at different stages of pancreatic differentiation (Fig. 6A and Fig. S5A). Differentiation cultures composed largely of PDX1+ pancreatic endoderm contain a mixture of RFX6-HA+ and RFX6-HA− cells. Specifically, 30% of PDX1+ pancreatic endoderm cells lack RFX6-HA protein (Fig. 6B). However, PDX1−;RFX6-HA+ cells are seldom observed.
To determine whether RFX6− pancreatic endoderm also arises in vivo, we performed high resolution in situ hybridisation for RFX6 on human embryos. Between CS12 and CS19, PDX1 is expressed ubiquitously in the multipotent progenitor cells of the pancreatic bud (Fig. S5B-D and Jennings et al., 2013). As observed in vitro, RFX6 is expressed in the majority of PDX1+ cells but RFX6− cells are present throughout the nascent pancreatic bud at CS14 and later in the fused bud at CS17 (Fig. 6C). A similar pattern is observed at 8 weeks post-conception (8 wpc), around the time the endocrine lineage becomes segregated from ductal cells (Fig. 6C; Fig. S5) (Jennings et al., 2015). These observations are consistent with studies in mice, in which PDX1+RFX6− cells are observed throughout the embryonic day 10 dorsal and ventral pancreatic buds (Smith et al., 2010; Soyer et al., 2010). When taken together, these observations suggest that human pancreatic endoderm consists of PDX1+ cells with and without RFX6 expression.
Although the mouse and human pancreas function similarly, with isolated human islets able to impart normoglycemia in diabetic mice, there are prominent anatomical, transcriptional and physiological differences between the two species. The developmental origins of these differences have been difficult to interrogate due in part to the inaccessibility of human foetal material. Our present study begins to fill this knowledge gap. Using micro-CT imaging, we chronicled human pancreatic development between CS13 and CS23, corresponding to roughly 30 days of gestation, with eight intact human embryonic specimens. From these images, sequential three-dimensional reconstructions were assembled that provide high-resolution temporal and spatial insight into the complex growth and morphogenesis of the dorsal and ventral pancreatic buds that coalesce to form the pancreatic anlagen (Fig. S2 and see 3D interactive PDF in the supplementary information). This timeline further allowed us to better understand the embryological consequences of homozygous RFX6 loss. At 12 weeks, the RFX6−/− foetus (III:4) lacks the pancreas body and tail, a phenotype that presages the overt pancreatic hypoplasia and neonatal diabetes observed in patients III:1, III:2 and III:3 (Fig. 1A; Fig. S1B and Movie 1). In contrast, Smith et al. (2010) reported that the major defect in Rfx6-deficient mice (i.e. the presence of Chga+ endocrine cells that lack hormones) is confined to the islets, with only incompletely penetrant and poorly described hypoplasia. This apparent species-specific difference in the role of Rfx6/RFX6 during mammalian pancreatic development motivated our derivation of iPSC from patients III:3 and III:4 in an effort to model the embryological and molecular origins of MRS in vitro.
We deployed a robust pancreatic differentiation protocol based on the work of Rezania et al. (2014) to identify the specific stage at which RFX6 loss results in the prominent pancreatic hypoplasia observed in the MRS pedigree (Fig. 1A). Molecular analyses at the mRNA and protein level demonstrate that the formation of both the definitive endoderm and early PFG proceeds normally. However, upon the provision of signals that impart pancreatic identity (e.g. retinoic acid), RFX6-deficient cells fail to robustly activate the cardinal pancreatic progenitor marker PDX1. This result was also independently observed by Zhu et al. (2016) in H1 cells in which RFX6 loss-of-function alleles were engineered using CRISPR/Cas9 gene editing of exon 3. Considering that PDX1 and RFX6 are activated almost synchronously, the simplest explanation for this result is that RFX6 directly regulates PDX1 expression. However, our RFX6-HA ChIP-seq did not identify PDX1 as a potential RFX6 target (Fig. 5; Table S5). A brief report by Cheng et al. (2019), who performed RFX6 ChIP-seq on the adult mouse pancreas, did show slight enrichment within one region of the Pdx1 promoter. In silico analysis with MatInspector predicts three potential X-box motifs with modest matrix similarity scores within the PDX1 gene. Further experiments are therefore necessary to establish whether PDX1 and RFX6 form a feedback loop during early human pancreatic development.
Much of our understanding of the gene regulatory networks governing human pancreatic development comes from studies in the mouse. For example, it was established more than 25 years ago that Pdx1 is absolutely required for the formation of the murine pancreas, as Pdx1-deficient neonates entirely lack all three pancreatic lineages and succumb to extreme hyperglycaemia within a few days of birth (Ahlgren et al., 1996; Jonsson et al., 1994; Offield et al., 1996). Rare individuals with homozygous or compound heterozygous PDX1 mutations were subsequently identified who also displayed pancreatic agenesis (Stoffers et al., 1997; https://www.omim.org/entry/600733). The MRS pancreatic phenotype (hypoplasia) is less severe than that caused by loss of PDX1 (agenesis), which suggests a simple epistatic relationship in which PDX1 is upstream of RFX6. In support of this hypothesis, we and others previously identified PDX1 binding sites in the human RFX6 promoter using ChIP-seq (Teo et al., 2015; Wang et al., 2018), and PDX1 and RFX6 display similar expression kinetics between days 7 and 8 of directed differentiation into the pancreatic lineage (Fig. 3A). Interestingly, however, not every PDX1+ cell co-expresses RFX6, either in vitro or in vivo (Fig. 6A-C). How this segregation is achieved developmentally and in vitro is at present unknown, but this observation suggests that PDX1+;RFX6+ and PDX1+;RFX6− cells give rise to distinct lineages (or sub-populations) in the adult pancreas. Loss of RFX6 in vitro results in an approximately fourfold reduction in PDX1+;NKX6-1+ pancreatic progenitors (Fig. 3D,E), a result consistent with the rare CHGA+ cells observed post-mortem in individuals with MRS (Mitchell et al., 2004). It is, however, important to emphasise that, although reduced, acinar and ductal tissue is present in individuals with MRS (Mitchell et al., 2004). This raises the possibility that PDX1+;RFX6− cells principally contribute to the non-endocrine component of the adult pancreas.
In addition, our micro-CT imaging of the 12-week-old RFX6−/− foetus reveals the presence of the pancreas head, which is derived from the ventral pancreatic bud, but the absence of the body and tail, which are derived from the dorsal pancreatic bud (Fig. 1C). These pathological observations raise an alternative possibility whereby PDX1+;RFX6− cells and their descendants contribute predominantly to the pancreas head, while PDX1+;RFX6+ cells form the body and tail. These hypotheses could be tested either by the generation of sophisticated reporter alleles in pluripotent cells (Bao et al., 2019) coupled with extended in vitro differentiation into regionally patterned pancreatic organoids or by conventional lineage tracing in mice because, as in human, not all Pdx1+ cells within the embryonic day 10 pancreatic bud are Rfx6+ (Smith et al., 2010; Soyer et al., 2010). Last, the number of PDX1+;NKX6-1+ pancreatic progenitors is sensitive to RFX6 gene dose in vitro (Fig. 3D,E). Thus, one simple explanation for the recent report of MODY in three generations of a Japanese family harbouring the same RFX6 allele described here (c.1129C>T) is a reduction in β-cell mass (Akiba et al., 2019).
Although studies of the RFX family of transcription factors during embryonic development are limited, RFX6 was previously shown in vitro to bind X-box sequences to activate expression of the insulin gene (INS) in the human β-cell line Endo-C-βH2, as well as the glucokinase (Gck) and ATP-binding cassette subfamily C member 8 (Abcc8, also known as Sur1) genes in the Min6 mouse insulinoma cell line (Chandra et al., 2014; Piccand et al., 2014). However, RFX6 is unable to activate transcription directly, as it lacks an activation domain (Aftab et al., 2008), but has been shown to dimerise with RFX2 and RFX3, which do contain such domains (Rual et al., 2005; Smith et al., 2010). Rfx3 also binds the Gck promoter in Min6 cells (Ait-Lounis et al., 2010). These in vitro findings suggest that Rfx3 and Rfx6 commonly co-regulate target genes. Additional support for this relationship comes from genetic studies in mouse: the phenotype of Rfx3 null mutant mice mirrors Rfx6 loss, although less severe, as both mutants produce endocrine cells that fail to express hormones (Ait-Lounis et al., 2010, 2007; Piccand et al., 2014; Smith et al., 2010). Significantly, RFX2 and RFX3 are co-expressed in the developing human pancreas (Cebola et al., 2015; Jennings et al., 2017) and during pancreatic differentiation in vitro (Fig. S6). PDX1 ChIP-seq studies also confirm that RFX3, like RFX6, is a PDX1 target gene during pancreatic differentiation in vitro (Wang et al., 2018). From these observations, one simple model emerges whereby specification of pancreatic identity in the developing human embryo – i.e. the onset of PDX1 transcription – is tightly followed by the activation of RFX2, RFX3 and RFX6, which form transcriptional complexes that activate (or repress) target genes that orchestrate the development of the pancreas body and tail.
We identified two such targets bioinformatically – the transcription factor genes ISL1 and HHEX – by combining comparative transcriptomics and ChIP-seq (Fig. 5 and Fig. S4). In the mouse, homozygous inactivation of Isl1 results in dorsal pancreatic bud agenesis (Ahlgren et al., 1997), whereas loss of Hhex leads to a complete failure of ventral pancreatic specification (Bort et al., 2004). Both mutations are lethal during mid-gestation, with pancreatic phenotypes much more severe than loss of Rfx6 (Piccand et al., 2014; Smith et al., 2010). These genetic studies therefore suggest that Rfx6 is not absolutely required for the early transcriptional activation of Isl1 and Hhex in the mouse, even though these three genes show overlapping expression patterns during early pancreatic development. This observation perhaps reflects the temporally distinct requirements between mouse Rfx6 and human RFX6 for pancreatic development emphasised by this study and previous work describing significant transcriptional differences between the mouse and human pancreas (Jennings et al., 2017). Whether human RFX6 directly regulates ISL1 and HHEX is also an unresolved issue, which can be addressed with in vitro reporter assays. To our knowledge, human pluripotent cell lines lacking either ISL1 or HHEX have not been generated. It would therefore be interesting to derive such lines and subject them to pancreatic differentiation and to assess their phenotypes in comparison with loss of RFX6. Comparative RNA-seq and ChIP-seq studies like those described here at key stages of in vitro differentiation for ISL1 and HHEX would additionally contribute to the construction of the regulatory ‘wiring diagram’ underlying early human pancreatic development.
Last, RFX6 is expressed in the insulin-secreting β-cells of the adult human islet (Chandra et al., 2014). Interestingly, recent genome-wide association studies for type 2 diabetes identified loci that are specifically enriched in RFX-binding motifs (Varshney et al., 2017). Our RFX6HA reporter allele coupled with extended differentiation protocols that efficiently produce ‘β-like’ cells in vitro (e.g. Nair et al., 2019) will provide a platform to formally identify those disease loci bound by RFX6. In the future, we intend to exploit RFX6−/− iPSC and the RFX6-HA-tagged line to further elaborate the function of RFX6, both at the level of the transcriptome and the proteome, in other endodermal lineages, including the lung and intestine, where its role is poorly described.
MATERIALS AND METHODS
Saliva samples were collected from parents and siblings (II:1, II:2 and III:5), and skin tissues from one affected daughter (III:3) and one affected foetus (III:4). Chorionic villus sampling was performed for prenatal diagnosis when the mother was pregnant with III:4. A skin biopsy was also collected from the healthy father (II:1). Genomic DNA (gDNA) was extracted using DNeasy Blood & Tissue Kits (Qiagen). All human biological materials were obtained after the parents gave their informed written consent (obtained and witnessed by co-authors Drs Fawaz Al-Kazaleh and Mohammad Shboul, respectively) and the local ethics commission gave its approval. The derivation of iPSC from the pedigree in Fig. 1A and micro-CT imaging of the 12-week foetus (III:4) are described in detail in the supplementary Materials and Methods.
For proband (III:3) whole-exome sequencing, 1 µg of purified genomic DNA was subjected to exome capture using Illumina TruSeq Exome Enrichment Kit (Illumina). Illumina HiSeq2500 high output mode was used for sequencing as 100 bp paired-end runs at the UCLA Clinical Genomics Centre. Sequence reads were aligned to the human reference genome (Human GRCh37/hg19 build) using Novoalign (v2.07, http://www.novocraft.com/main/index.php). PCR duplicates were identified by Picard (v1.42, http://picard.sourceforge.net/) and Genome Analysis Toolkit (v1.1, http://www.broadinstitute.org/gatk/) (McKenna et al., 2010) was used to re-align indels, recalibrate the quality scores, and call, filter, recalibrate and evaluate the variants. SNVs and INDELs across the sequenced protein-coding regions and flanking junctions were annotated using Variant Annotator X (VAX), a customised Ensembl Variant Effect Predictor (Yourshaw et al., 2015). Each variant was annotated with information including gene names and accession numbers, reference variant, variant consequences, protein positions and amino-acid changes, conservation scores, population minor allele frequencies and expression pattern, as well as PolyPhen2 (Adzhubei et al., 2010), SIFT (Kumar et al., 2009) and Condel (González-Pérez and López-Bigas, 2011) predictions. Almost 20,000 variants were identified across the RefSeq protein-coding exons and flanking introns (±2 bp). Of these, 37 homozygous and 119 compound heterozygous variants were protein-changing variants with population minor allele frequencies of less than 1%. Of the 37 homozygous variants, only three passed quality filters, lay within regions of homozygosity, were not seen in population databases and were predicted to be deleterious, including one stop-gained mutation in the RFX6 gene.
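The successive filters described above can be expressed as a single predicate over annotated variants. The sketch below is a hedged illustration rather than the VAX-based workflow itself: the `Variant` fields, the consequence labels and the toy records (other than the stop-gained RFX6 variant reported here) are hypothetical, and "not seen in population databases" is modelled as a minor allele frequency of exactly zero.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    zygosity: str                # "hom" or "het"
    consequence: str             # simplified consequence label (assumed vocabulary)
    maf: float                   # population minor allele frequency; 0.0 if never observed
    passes_qc: bool
    in_roh: bool                 # lies within a region of homozygosity
    predicted_deleterious: bool  # e.g. summarising PolyPhen2/SIFT/Condel calls


PROTEIN_CHANGING = {"missense", "stop_gained", "frameshift", "splice_site"}


def is_candidate(v, maf_cutoff=0.01):
    """Apply the filters in the order given in the text."""
    rare_protein_changing = v.consequence in PROTEIN_CHANGING and v.maf < maf_cutoff
    return (
        v.zygosity == "hom"
        and rare_protein_changing
        and v.passes_qc
        and v.in_roh
        and v.maf == 0.0
        and v.predicted_deleterious
    )


# Toy records; GENE_X/Y/Z are placeholders, not real findings.
variants = [
    Variant("RFX6", "hom", "stop_gained", 0.0, True, True, True),
    Variant("GENE_X", "hom", "missense", 0.004, True, True, True),   # seen in population
    Variant("GENE_Y", "het", "frameshift", 0.0, True, False, True),  # not homozygous
    Variant("GENE_Z", "hom", "synonymous", 0.0, True, True, False),  # not protein-changing
]

candidates = [v.gene for v in variants if is_candidate(v)]
print(candidates)
```

Expressing the filters as one function makes the order and stringency of each criterion explicit, which is useful when revisiting variants that narrowly fail a single filter.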
Cell culture and differentiation
Human pluripotent stem cells were seeded onto Matrigel (Corning, 354277)-coated plates and cultured in mTeSR 1 medium (STEMCELL Technologies, 85850). Cells were passaged at a 1:20-1:25 split ratio every 4 days. For passaging, Gentle Cell Dissociation Reagent (STEMCELL Technologies, 07174) or ReLeSR (STEMCELL Technologies, 05872) was used. For differentiation into pancreatic progenitors, we employed the STEMdiff Pancreatic Progenitor kit (STEMCELL Technologies, 05120) with the following modifications: (1) cells were initially seeded onto 12-well plates (Corning, 353043) at a density of 10^6 cells/well; (2) stage 1 was extended to 3 days by repeating the final day's treatment; and (3) stage 3 was shortened to 3 days. All tissue culture was carried out in 5% CO2 at 37°C.
Cells were lysed in RIPA buffer (ThermoFisher Scientific, 89901) supplemented with protease inhibitors (Calbiochem). Protein concentrations of lysates were measured by Bradford assay and equal quantities of protein were loaded on precast 7% or 10% SDS-polyacrylamide gels (Bio-Rad). Proteins were transferred to PVDF membranes using the BioCraft (BE-300) transfer system. Membranes were blocked for 1 h at room temperature in blocking buffer [5% non-fat milk in 1× Tris-buffered saline (TBS)+0.1% Tween-20] then incubated with primary antibody in dilution buffer (1% non-fat milk in TBS+0.1% Tween-20) overnight at 4°C. Membranes were washed three times with wash buffer (TBS+0.3% Tween-20) at room temperature for 15 min each. Membranes were then incubated with secondary antibody at room temperature for 1 h. Primary antibody against RFX6 (R&D, AF7780, 1/500) was recognised by the secondary antibody goat anti-sheep IgG HRP conjugate (R&D, HAF016, 1/3000). Western blots were detected using SuperSignal West Pico or Dura Chemiluminescent Substrate (ThermoFisher Scientific) and imaged on X-ray film (GE Healthcare) using an X-ray processor (Carestream-200).
Immunofluorescence was carried out essentially as previously described by Trott et al. (2017). Briefly, adherent cells were washed twice with PBS, fixed in 4% paraformaldehyde for 20 min at room temperature, washed three times with PBS+0.1% BSA, and then incubated with blocking buffer (PBS+20% normal donkey serum+0.1% BSA+0.3% Triton X-100) for 1 h at room temperature. Samples were then incubated overnight at 4°C with primary antibodies diluted in blocking buffer. After washing three times with PBS+0.1% BSA for 15 min, samples were incubated at room temperature for 1 h with secondary antibodies diluted 1:500 in blocking buffer. All subsequent steps were carried out in the dark. After washing three times with PBS+0.1% BSA for 15 min, samples were incubated at room temperature for 15 min with 2 μg/ml Hoechst-33342 (Thermo Fisher Scientific, 62249) diluted in PBS. Finally, samples were washed twice with PBS for 15 min and imaged. Primary antibodies were recognised by Alexa-fluorophore-conjugated secondary antibodies raised in donkey. Primary and secondary antibodies used for immunofluorescence staining are listed in Table S6. Images were acquired using an Olympus FV1000 inverted confocal microscope. The protocol we used to segment cells and extract staining intensities using FIJI is described at https://www.youtube.com/watch?v=82N-eIPqnwM (Fig. 6B). The FACS-style gating is based on staining with secondary antibodies alone. Three fields of view were selected from three independent differentiation experiments.
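The idea of gating on a secondary-antibody-only control can be sketched as follows. The text does not state how the gate was derived from the control, so the 99th-percentile rule, the function names and the intensity values below are all assumptions for illustration, not the protocol used in FIJI.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (pct between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def gate_threshold(control_intensities, pct=99.0):
    """Assumed rule: gate at a high percentile of the secondary-only control."""
    return percentile(control_intensities, pct)


def fraction_positive(sample_intensities, threshold):
    """Fraction of segmented cells whose intensity exceeds the gate."""
    positive = sum(1 for x in sample_intensities if x > threshold)
    return positive / len(sample_intensities)


# Made-up per-cell mean intensities (arbitrary units).
secondary_only = [10, 12, 11, 9, 13, 10, 12, 11, 14, 10]
stained = [8, 15, 40, 12, 90, 14, 30, 11]

threshold = gate_threshold(secondary_only)
print(threshold, fraction_positive(stained, threshold))
```

With a gate defined this way, any cell brighter than almost all control cells is scored as positive, which mirrors how a negative-control gate is drawn in flow cytometry.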
Single cells were obtained from human pluripotent stem cells or their differentiated derivatives at selected time points following Accutase treatment (Thermo Fisher Scientific, 14190), washed once with PBS+1% serum, fixed in 4% paraformaldehyde for 10 min at room temperature, and washed once with wash/permeabilisation buffer (WPB) (Becton Dickinson, 554723). Up to 10^6 cells were incubated with primary or isotype control antibodies diluted in 250 μl WPB. For unconjugated antibodies, cells were washed once with WPB then incubated for 15 min with secondary antibodies diluted in WPB. If staining for a second antigen, cells were washed once with WPB then subjected to the aforementioned incubation step(s). After washing once with WPB, cells were resuspended in PBS+1% serum and analysed using a BD FACSCalibur flow cytometer. All steps were carried out at room temperature. Cells were pelleted by centrifugation at 3500 g for 5 min in a microcentrifuge. Antibodies and their respective dilutions are listed in Table S7.
Cells were grown in 12-well plates for total RNA isolation. Three wells were harvested per sample to obtain technical triplicates. The RNeasy Mini Kit (Qiagen, 74104) was used in conjunction with the QIAcube (Qiagen, 9001292) for total RNA extraction. Cell culture medium was aspirated and the cells were washed once with D-PBS. After removal of D-PBS, cells were lysed directly in the 12-well plates by adding 350 μl of RLT buffer. Cell lysates were transferred to 2 ml tubes and were either frozen at −80°C or used immediately with the QIAcube for RNA extraction. Each sample was treated with RNase-Free DNase (Qiagen, 7924) to avoid DNA contamination. RNA was eluted in a volume of 30 μl and frozen at −80°C or immediately taken to the next step of first-strand cDNA synthesis. If RNA samples were frozen, they were thawed on ice to prevent degradation. 400 ng of RNA was adjusted to a volume of 20 μl with nuclease-free water and then reverse transcribed to generate cDNA using a high-capacity reverse transcription kit and random hexamer primers (Applied Biosystems, 4368814). The reaction tubes were incubated in a thermal cycler (10 min at 25°C for the primer annealing step, 120 min at 37°C for the extension step, and finally 5 s at 85°C for the inactivation of the enzyme). The resulting cDNA was diluted to a final volume of 200 μl with nuclease-free water prior to use for qRT-PCR. Quantitative RT-PCR was carried out using SYBR Select Master mix (Applied Biosystems, 4472908). Primers are listed in Table S8.
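The dilution arithmetic for the cDNA can be checked with a short calculation. The 400 ng input and 200 μl final volume come from the text; the function names and the 2 μl template volume per qRT-PCR reaction are hypothetical, and complete conversion of RNA to cDNA is assumed only for the purpose of expressing amounts as RNA equivalents.

```python
def rna_equivalent_ng_per_ul(rna_input_ng, final_volume_ul):
    """Concentration of diluted cDNA expressed as input-RNA equivalents."""
    return rna_input_ng / final_volume_ul


def rna_equivalent_per_reaction_ng(conc_ng_per_ul, template_volume_ul):
    """Input-RNA equivalents loaded into one qRT-PCR reaction."""
    return conc_ng_per_ul * template_volume_ul


conc = rna_equivalent_ng_per_ul(400, 200)            # values from the text
per_well = rna_equivalent_per_reaction_ng(conc, 2)   # 2 ul template is an assumption
print(conc, per_well)
```

Under these assumptions the diluted cDNA corresponds to 2 ng of input RNA per μl.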
RNA was isolated from samples harvested from day 8 differentiation cultures using the RNeasy mini kit (Qiagen, 74104). All RNA samples had an RNA integrity number greater than eight. The mRNA was purified from total RNA using poly-T oligo-attached magnetic beads. RNA-seq libraries were generated using the NEBNext Ultra RNA Library Prep Kit (NEB, E7530L). RNA-seq libraries (200-300 bp insert cDNA library) were sequenced on a NovaSeq 6000 generating paired-end 150 bp reads. Table S1 contains metadata for all seven samples. Raw RNA-seq read files have been deposited in ArrayExpress under accession number E-MTAB-9243.
Raw reads were filtered to remove reads containing: (1) adaptors, (2) greater than 10% undetermined bases, or (3) more than 50% of bases with a Q score (quality value) less than six. Reads were mapped using STAR (v2.5) and an index based on the soft-masked primary assembly of reference genome GRCh37 and the corresponding gene annotation file. HTSeq (v0.6.1) was used to count the number of reads mapped to each gene (gene counts). Differential gene expression analysis was conducted using DESeq2 (v1.22.2) and untransformed data. PCA plots (DESeq2, plotPCA) and heatmaps (pheatmap) were produced using rlog-normalised data (DESeq2). Genes expressed in the dorsal pancreas of CS12-14 human embryos were identified using a published dataset (Jennings et al., 2017). Dorsal pancreas genes are defined as those expressed at more than twice the levels of the hepatic cords (LogFC>1, FDR<0.05).
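The three read-filtering criteria can be written out as a small function. This is a sketch rather than the filtering software actually used: it assumes Phred+33 quality encoding, treats a read as adaptor-containing if it includes the given adaptor string exactly, and assumes a pair is discarded when either mate fails. The adaptor sequence and example reads are illustrative.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def should_discard(seq, qual, adaptor,
                   max_n_fraction=0.10, low_q_cutoff=6, max_low_q_fraction=0.50):
    """Return True if a read fails any of the three criteria in the text."""
    if not seq:
        return True
    # (1) read contains adaptor sequence (exact substring match is an assumption)
    if adaptor and adaptor in seq:
        return True
    # (2) more than 10% undetermined bases
    if seq.upper().count("N") / len(seq) > max_n_fraction:
        return True
    # (3) more than 50% of bases with Q score below six
    scores = phred33_scores(qual)
    low = sum(1 for q in scores if q < low_q_cutoff)
    return low / len(scores) > max_low_q_fraction


def keep_pair(read1, read2, adaptor):
    """Assumed paired-end rule: keep the pair only if both mates pass."""
    return not (should_discard(*read1, adaptor) or should_discard(*read2, adaptor))


ADAPTOR = "AGATCGGAAGAGC"  # illustrative adaptor sequence

good = ("ACGTACGTAC", "IIIIIIIIII")
too_many_n = ("ACGTNNACGT", "IIIIIIIIII")
low_quality = ("ACGTACGTAC", "######IIII")
with_adaptor = ("ACGTAGATCGGAAGAGCACG", "I" * 20)

print(keep_pair(good, good, ADAPTOR), keep_pair(good, low_quality, ADAPTOR))
```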
Chromatin immunoprecipitation sequencing
The derivation and clonal expansion of the triplicated HA RFX6 reporter line generated by CRISPR/Cas9 gene editing coupled with homologous recombination (Fig. 4 and Fig. S3) are described in detail in the supplementary Materials and Methods. All gRNA sequences as well as the T7E1 assay, sequencing and genotyping primers are listed in Table S9. Day 8 RFX6HA differentiation cultures were harvested and processed for chromatin immunoprecipitation (ChIP) as described in the manufacturer's instructions for the SimpleChIP Plus Enzymatic Chromatin IP Kit (Magnetic Beads) (Cell Signaling Technology, 9005). Chromatin was immunoprecipitated using rabbit anti-HA immunoglobulin (Cell Signaling Technology, 3724). Library preparation was performed using the NEBNext Ultra II DNA Library Prep Kit for Illumina (E7645). Sequencing was carried out as single-end 76 bp reads (SE76) on an Illumina NextSeq. Sequencing quality was assessed using FastQC (v0.11.4). Raw sequencing files are available for download at ArrayExpress under accession number E-MTAB-9335.
Reads were aligned to the Ensembl hg19 genome assembly (GRCh37 release 99) using STAR (2.7.1a) with specific options for genomic DNA sequencing (‘--alignIntronMax 1 --alignEndsType EndToEnd’). Uniquely mapping aligned reads from individual sequencing lanes were merged for each sample and duplicate reads were marked and counted using SAMBAMBA (v0.6.6). To assess ChIP sequencing enrichment strength, CHANCE analysis (v2.0) was performed on each ChIP sample with the corresponding input sample as the input control. MACS2 (v2.2.6) peak calling was performed on each ChIP sample with the corresponding input sample as the background. Peaks were filtered for ENCODE blacklisted genomic regions using bedtools intersect (v2.28.0) to remove peaks overlapping with high complexity and artefact regions (https://github.com/Boyle-Lab/Blacklist/blob/master/lists/hg19-blacklist.v2.bed.gz). After filtering, called peaks in H9 ChIP and RFX6-HA ChIP samples were merged and used for differential enrichment analysis. To identify RFX6-HA-specific peaks, differential enrichment analysis was performed on the merged peaks with RFX6-HA ChIP as the target and H9 ChIP as the background using HOMER getDifferentialPeaks (v4.9) to identify differential peaks with fold change ≥4.0 and P≤0.0001 (‘-F 4.0 -P 0.0001’). Genomic feature association, gene association and ChIP signal analyses were performed on the RFX6-HA-specific peaks using HOMER annotatePeaks.pl (v4.9). De novo motif discovery analysis was performed using HOMER findMotifsGenome (v4.9) fasta-based analysis using the RFX6-HA-specific peaks as the target and the H9-specific peaks as the background. Discovered motifs were matched to known transcription factor target motifs in the JASPAR database. Overlap between RFX6-HA-specific peaks and DNaseI hypersensitivity sites was assessed using bedtools intersect with the ENCODE DNaseI Hypersensitivity Site Master List (125 cell types) (tableName: wgEncodeAwgDnaseMasterSites).
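Two of the filtering steps above, removal of peaks that overlap blacklisted regions and selection of peaks meeting the fold-change and P-value thresholds, can be expressed in a few lines. This sketch does not reproduce bedtools or HOMER; it only illustrates the logic, assuming BED-style half-open coordinates. The `Interval` type, function names and example coordinates are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # exclusive


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def remove_blacklisted(peaks, blacklist):
    """Drop every peak that overlaps any blacklisted interval."""
    return [p for p in peaks if not any(overlaps(p, b) for b in blacklist)]


def is_specific(fold_change, p_value, min_fold=4.0, max_p=1e-4):
    """Thresholds from the text: fold change >= 4.0 and P <= 0.0001."""
    return fold_change >= min_fold and p_value <= max_p


# Made-up coordinates for illustration.
peaks = [
    Interval("chr1", 100, 200),
    Interval("chr1", 950, 1050),   # overlaps the blacklisted region below
    Interval("chr1", 900, 1000),   # ends where the blacklisted region starts
    Interval("chr2", 100, 200),
]
blacklist = [Interval("chr1", 1000, 2000)]

kept = remove_blacklisted(peaks, blacklist)
print(len(kept), is_specific(4.0, 0.0001), is_specific(8.0, 0.001))
```

Because the coordinates are half-open, a peak that ends exactly where a blacklisted region begins is retained.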
Analysis of human foetal tissue
The collection, use and storage of human embryonic and foetal tissue were carried out with ethical approval from the North West Research Ethics Committee, under the codes of practice issued by the Human Tissue Authority and legislation of the UK Human Tissue Act 2008. Details on collection and handling are as described previously (Jennings et al., 2013). In brief, human embryos and foetal pancreas were fixed within 1 h in 4% paraformaldehyde under RNase-free conditions, processed and embedded in paraffin wax for sectioning at 5 μm intervals. In situ detection of RFX6 and PDX1 RNA transcripts was carried out on paraffin wax-embedded tissue sections using RNAScope [Advanced Cell Diagnostics (ACD)]. Sections were pre-treated using an extended protease treatment and hybridised under conditions as described (RNAScope Sample Preparation and Pretreatment Guide) using automated RNAScope probes for RFX6 and PDX1 and standard negative dapB and positive PPIB control probes. Detection was by RNAScope LS Multiplex Reagent Kit (ACD; 322800) for the Leica Bond RX autostainer. Immunohistochemistry was performed on 5 μm sections as described previously (Jennings et al., 2013) using a guinea pig anti-PDX1 primary antibody (Abcam ab47308, 1/500). For further details, see supplementary Materials and Methods.
We first acknowledge the generosity of the Syrian family, living as refugees in Jordan, who endured the repeated tragedy of Mitchell-Riley Syndrome and who chose to selflessly share their loss in the belief that science would, in the future, prevent their experience. We thank Brigid Hogan for her encouragement and feedback on the manuscript, former members of the IMB Stem Cell Bank who were involved in the initial derivation of the iPSC lines – Michelle Eio, Grace Selva Raj and Puck Wee Chan – as well as Norihiro Tsuneyoshi and Sheena Ong, who provided advice and technical assistance.
Software: L.B.T.; Formal analysis: V.T., P.-A.G., J.H., J.G., S.R.L., E.G.; Investigation: J.T., Y.A., E.K.T., Y.D., M.E., H.W., S.E., G.N., S.J., S.W., J.S., A.K., S.M., R.J., A.E., H.L., S.F.N., H.S., N.H., B.S.D.B.; Resources: M.S., C.B., F.A.-K., M.E.-K., R.F., B.R.; Data curation: L.B.T.; Writing - original draft: J.T., N.R.D.; Writing - review & editing: J.T., N.R.D.; Supervision: N.R.D.; Project administration: N.R.D.; Funding acquisition: N.R.D.
J.T. was funded by a National Medical Research Council Award Young Investigator Research Grant (OFYIRG18May-0049). Y.A. was awarded a Singapore International Graduate Award by the Agency for Science, Technology and Research (A*STAR) Graduate Academy. N.R.D. and B.R. were provided with core funding by the A*STAR Institute of Medical Biology and were additionally supported by an Economic Development Board Singapore ‘Singapore Childhood Undiagnosed Diseases’ program grant (IAF311019) and an Agency for Science, Technology and Research Strategic Positioning Fund ‘Genetic Orphan Diseases Adopted: Fostering Innovation Therapy’ (GODAFIT) grant.
Raw RNA-seq read files have been deposited in ArrayExpress under accession number E-MTAB-9243. Raw ChIP sequencing files are available for download at ArrayExpress under accession number E-MTAB-9335.
The authors declare no competing or financial interests.