During oogenesis, hundreds of maternal RNAs are selectively localized to the animal or vegetal pole, including determinants of somatic and germline fates. Although microarray analysis has identified localized determinants, it is not comprehensive and is limited to known transcripts. Here, we utilized high-throughput RNA-sequencing analysis to comprehensively interrogate animal and vegetal pole RNAs in the fully grown Xenopus laevis oocyte. We identified 411 (198 annotated) and 27 (15 annotated) enriched mRNAs at the vegetal and animal pole, respectively. Ninety were novel mRNAs over 4-fold enriched at the vegetal pole and six were over 10-fold enriched at the animal pole. Unlike mRNAs, microRNAs were not asymmetrically distributed. Whole-mount in situ hybridization confirmed that all 17 selected mRNAs were localized. Biological function and network analysis of vegetally enriched transcripts identified protein-modifying enzymes, receptors, ligands, RNA-binding proteins, transcription factors and co-factors with five defining hubs linking 47 genes in a network. Initial functional studies of maternal vegetally localized mRNAs show that sox7 plays a novel and important role in primordial germ cell (PGC) development and that ephrinB1 (efnb1) is required for proper PGC migration. We propose potential pathways operating at the vegetal pole that highlight where future investigations might be most fruitful.
For many organisms, oogenesis is a protracted affair during which RNAs and proteins are synthesized and stored for later use during embryogenesis. Patterning of the early Xenopus embryo is determined by these components and includes specification of the three axes (animal/vegetal, dorsal/ventral, left/right), the three primary germ layers, and the germ cell lineage (King, 2014). The animal/vegetal (A/V) axis is the first to be established and specifies where the three primary germ layers will arise in the embryo. Visible signs of A/V polarity are obvious in the stage I oocyte as the Balbiani body (BB) (or mitochondrial cloud) forms in close association with the nucleus and faces the future vegetal pole. The BB contains the maternal stockpile of mitochondria as well as the germline determinants embedded within germ plasm. Later in oogenesis, the BB components accumulate at the vegetal pole, becoming tightly associated with the subcortical region.
The identity of the maternal RNAs and proteins that participate in embryonic patterning, and thus normal development, is of great interest. Initial screens selecting mRNAs enriched at either pole identified both somatic determinants, such as vg1, vegT and wnt11, and germ cell determinants including nanos1, deadsouth, xdazl and xpat (Mowry, 1996; King, 2014; Aguero et al., 2016). These mRNAs defined two patterns of RNA localization during oogenesis that appeared to align with their embryonic functions: BB-localized RNAs that function in germline identity (early pathway); and RNAs that are uniformly distributed in stage I but vegetally localized during stages II-IV (late pathway) and function in somatic patterning. However, as the number of known localized RNAs increased, some were found that used both the early and late pathways (hermes, fatvg), or used the late pathway but, after fertilization, were found only in the germ plasm (dead-end). Loss-of-function studies revealed that these RNAs indeed have multiple functions important to both somatic and germ cell lineages (Houston, 2013).
How the asymmetric distribution of maternal RNA controls embryonic patterning represents a key area of research in developmental biology. Recent microarray data using cortical RNAs as probes have identified several hundred transcripts at the vegetal pole and many fewer localized at the animal pole (Cuykendall and Houston, 2010). Although microarrays have identified transcripts localized to the vegetal cortex and to germ plasm, this type of analysis is limited in sensitivity and to known transcripts. A comprehensive analysis identifying the RNAs, both coding and non-coding, that are significantly enriched at either the animal or vegetal pole is an important first step towards understanding the maternal contribution to embryonic patterning.
In the present study, we utilized high-throughput RNA-sequencing (RNA-seq) analysis to interrogate both animal and vegetal pole localized RNAs in the fully grown oocyte. We identified 411 vegetally localized mRNAs and, of those, 198 are previously identified genes currently in the Xenbase database (Karpinka et al., 2015). Analysis of vegetally enriched transcripts identified receptors, ligands, RNA binding proteins, protein modifying enzymes and transcription factors, as well as defined gene hubs. Functional analysis of key genes confirmed their roles in primordial germ cell (PGC) development. We also identified eight microRNAs (miRNAs), all uniformly distributed, suggesting that early embryonic patterning is not regulated by localized maternal miRNAs but rather their localized mRNA targets. Analysis of non-coding RNAs must await further annotation of the Xenopus tropicalis or laevis genome. Here, we present a comprehensive analysis of identified RNAs found enriched at either the animal or vegetal pole. Our findings strongly support the vegetal pole as a major signaling center that patterns the early embryo.
RNA-seq analysis of vegetal and animal poles
To identify transcripts localized at either the vegetal or animal pole, RNA was isolated from the respective poles (each comprising ∼10-20% of total oocyte) of stage VI X. laevis oocytes and subjected to RNA-seq analysis. A total of six samples, comprising three vegetal and three oocyte-matched animal poles, were included in the analysis as described in Materials and Methods. The total number of reads was virtually identical across all three vegetal and matched animal pole samples, indicating consistency between samples (Fig. 1A). The reads were aligned to the version 7.1 X. laevis genome (Xenbase.org) and principal component analysis (PCA) was performed on the normalized results. Two-dimensional PCA showed that transcripts identified in the vegetal pole samples cluster together and away from the animal pole samples, which also cluster together (Fig. 1B). The identified transcripts having an FDR<0.05 and an FPKM≥5 were used to generate a scatter plot (Fig. 1C). The data support and extend previous analyses that show both a greater complexity and fold enrichment of RNAs at the vegetal pole in comparison to the animal pole (Cuykendall and Houston, 2010; De Domenico et al., 2015). As expected, mRNAs known to be localized at the vegetal pole (nanos1, dazl, ddx25/deadsouth) and animal pole [dand5 (coco) and slc18a2 (vmat2)] were identified as well as novel mRNAs. Over five thousand transcripts (5717) were found differentially expressed between the animal and vegetal poles based on a q-value cutoff of 0.05 and a minimum 2-fold change.
Transcripts were considered significantly enriched at the vegetal pole if they had at least a 4-fold increase compared with the animal pole. The stringent criteria set yielded a total of 411 vegetally enriched transcripts, 198 of which were annotated (Fig. 1D, Table S1). Of the 198 transcripts identified, 38 have been shown to be vegetally localized, with 23 of them being specifically associated with germ plasm (Cuykendall and Houston, 2010; Claussen et al., 2015; De Domenico et al., 2015). Transcripts were considered localized at the animal pole if they were at least 10-fold enriched compared with the vegetal pole. Under these conditions, 27, including 15 annotated, mRNAs were enriched at the animal pole (Fig. 1D). All annotated and unannotated transcripts can be found in Tables S2 and S3, respectively.
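The enrichment criteria described above can be written as a simple filter. The following Python sketch is illustrative only and is not the analysis pipeline used in this study: the record fields (fpkm_vegetal, fpkm_animal, q_value), the helper name classify_transcript, the pseudocount and the example transcripts are all assumptions, as is the choice to apply the FPKM≥5 cutoff to the pole at which a transcript is enriched.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MAX_Q = 0.05         # FDR / q-value cutoff
MIN_FPKM = 5.0       # minimum expression level
VEGETAL_FOLD = 4.0   # vegetal/animal ratio required for vegetal enrichment
ANIMAL_FOLD = 10.0   # animal/vegetal ratio required for animal enrichment


@dataclass
class Transcript:
    name: str            # hypothetical identifier
    fpkm_vegetal: float  # mean FPKM in vegetal pole samples
    fpkm_animal: float   # mean FPKM in animal pole samples
    q_value: float       # multiple-testing-corrected significance


def classify_transcript(t: Transcript, pseudocount: float = 0.01) -> str:
    """Return 'vegetal', 'animal' or 'not enriched' for one transcript.

    The pseudocount (an assumption) avoids division by zero when a
    transcript is undetected at one pole.
    """
    if t.q_value >= MAX_Q:
        return "not enriched"
    veg = t.fpkm_vegetal + pseudocount
    ani = t.fpkm_animal + pseudocount
    if t.fpkm_vegetal >= MIN_FPKM and veg / ani >= VEGETAL_FOLD:
        return "vegetal"
    if t.fpkm_animal >= MIN_FPKM and ani / veg >= ANIMAL_FOLD:
        return "animal"
    return "not enriched"


# Made-up example records, for illustration only.
examples = [
    Transcript("tx_a", 40.0, 5.0, 0.001),
    Transcript("tx_b", 2.0, 30.0, 0.01),
    Transcript("tx_c", 6.0, 30.0, 0.01),
    Transcript("tx_d", 40.0, 5.0, 0.2),
]
calls = {t.name: classify_transcript(t) for t in examples}
```

In this sketch the asymmetric cutoffs (4-fold for the vegetal pole, 10-fold for the animal pole) are kept as separate constants so that either can be adjusted without touching the other.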
Biological process and network analysis of vegetally enriched transcripts
To identify possible gene functions, the 198 annotated vegetal mRNAs were manually data mined using GeneCards (www.genecards.org). Table 1 shows the 40 most enriched transcripts in the vegetal pole. Ten categories were established based on function (Fig. 2A). The top six categories were: signal transduction (26%), transport (13%), transcription (7%), cytoskeletal related (8%), the ubiquitin pathway (7%) and cell cycle (7%). Enzymes are often key players in regulating gene pathways; therefore, we also identified and categorized the enzymes represented in our vegetally enriched data set. Enzymes represent 60/198 (30%) of localized transcripts. Nine categories of enzymes were identified, with kinases (18%), metabolism related (17%), ubiquitin pathway (13%) and ATPases/GTPases (12%) making up the majority (Fig. 2B).
Vegetally localized mRNAs were subject to gene pathway and network analysis by GeneGo. Significantly related gene ontology (GO) processes were grouped into seven categories based on gene expression (Table 2). Consistent with a role in embryonic patterning, these categories included: developmental processes, signaling regulation, localization, phosphate metabolic processes, cellular protein metabolic processes, cell cycle, and gamete generation. Interestingly, genes involved in neurogenic processes such as neuroblast proliferation, including fgfr2, frizzled1 and ephrinB1 (efnb1), were well represented in our data set, composing 12% of annotated genes (Table 2).
We next investigated potential gene networks present within the 198 vegetally enriched transcripts. Using MetaCore analysis (GeneGo), we identified 47 genes that form a direct interaction network (Fig. 2C). These genes encode protein-modifying enzymes [caspase 3 (casp3), cathepsin C (ctsc), pcsk6/pace4, tesk1/2, senp1], receptors [fgfr2, frizzled1 (fzd1), a2mr/lrp1], ligands (wnt11), and five key transcription factors or co-factors (e2f1, irf8, err1/esrra, p300/ep300 and sox7), the first four of which represent network hubs. These mRNAs were validated as vegetally localized by either RT-qPCR (Fig. 2D) or WISH (Fig. 3). Published studies on these factors suggest their involvement in regulating the cell cycle (e2f1), endoderm specification (sox7), metabolic pathways (err1) and lineage commitment (irf8) (Costa et al., 2013; Johansen et al., 2016; Minderman et al., 2016; Stovall et al., 2014).
miRNA target RNAs are localized at the vegetal pole
Germ plasm RNAs must be post-transcriptionally regulated for germline survival (Lai et al., 2012; reviewed by Lai and King, 2013). RNA degradation within germ plasm may be regulated by miRNAs (Bartel, 2004; Yamaguchi et al., 2014). Therefore, we mined our data to identify vegetally localized miRNAs. Our analysis identified only eight miRNAs that were expressed in both the vegetal and animal poles: 15c, 18a, 19b, 20a, 92a, 363, 427 and 429. Surprisingly, none was significantly enriched at either pole (data not shown).
We next determined if the predicted mRNA targets of the eight identified miRNAs were vegetally localized. Interestingly, predicted targets of 7/8 miRNAs are enriched in the vegetal pole (Table 3). In total, 13 vegetally localized target mRNAs (listed in Table 3) were identified that contain at least one recognition sequence conserved between X. tropicalis and human for their respective miRNAs. These results suggest that if early embryonic patterning is regulated by miRNA activity, it is not by localizing miRNAs to the vegetal pole but rather by targeting specific vegetally localized RNAs.
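The comparison between predicted miRNA targets and vegetally enriched mRNAs amounts to a set intersection. The Python sketch below is a minimal illustration under stated assumptions: the function name localized_targets and the gene placeholders (gene_a, gene_b and so on) are hypothetical, and no target-prediction tool or conservation filter is modeled; the lists stand in for whatever such a tool would return.

```python
def localized_targets(predicted_targets, vegetal_mrnas):
    """Map each miRNA to those of its predicted targets that are vegetally enriched.

    predicted_targets: dict of miRNA name -> iterable of predicted target genes
    vegetal_mrnas: collection of vegetally enriched gene names
    miRNAs with no enriched target are omitted from the result.
    """
    vegetal = set(vegetal_mrnas)
    hits = {}
    for mirna, targets in predicted_targets.items():
        overlap = sorted(set(targets) & vegetal)
        if overlap:
            hits[mirna] = overlap
    return hits


# Placeholder inputs, for illustration only (not data from this study).
predicted = {
    "miR-x": ["gene_a", "gene_b", "gene_c"],
    "miR-y": ["gene_c", "gene_d"],
    "miR-z": ["gene_e"],
}
vegetal_enriched = {"gene_a", "gene_c", "gene_f"}

hits = localized_targets(predicted, vegetal_enriched)
n_mirnas_with_hits = len(hits)
n_unique_targets = len({g for genes in hits.values() for g in genes})
```

Counting the keys of the result gives the number of miRNAs with at least one localized target, and the union of the values gives the number of distinct localized targets.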
Expression of vegetally localized RNAs during development
We chose 17 transcripts (xpat, efnb1, rras2, mov10, otx1, sox7, spire1, wnk2, e2f1, sybu, atrx, hook2, tob2, rnf38, trank1, wwtr1 and parn) for WISH analysis to determine their expression pattern during embryogenesis (Fig. 3, Fig. S1A,B). Consistent with our RNA-seq data, 15/17 were expressed exclusively in the vegetal pole of stage IV oocytes (Fig. 3, Fig. S1A). Not surprisingly, the two low expressing mRNAs, parn and wwtr1 (only 5- to 6.7-fold enriched compared with the animal pole) were not detected in stage IV oocytes but were detected later at blastula stage (Fig. S1B). Recently, another group analyzed transcript localization in the 8-cell X. tropicalis embryo by RNA-seq (De Domenico et al., 2015). Comparison of the vegetal/animal blastomeres revealed that 27 of our filtered 198 transcripts remain vegetally enriched after both fertilization and cortical rotation have occurred (De Domenico et al., 2015).
Spatial expression patterns were examined during oogenesis, the pre-midblastula transition (MBT), gastrula, neurula, and tailbud stages to determine whether localized mRNAs contribute to the future germline, soma, or both lineages. During oogenesis, three mRNA localization patterns were detected: the early or METRO pathway (Fig. 3A), the late Vg1-like pathway [Fig. 3B, Fig. S1A (atrx, hook2, tob2), Fig. S1B], or both [Fig. 3C, Fig. S1A (rnf38)] (King et al., 1999). Germ plasm-specific xpat served as a marker for germline expression (Hudson and Woodland, 1998). Regardless of the pathway used, all mRNAs were subsequently found in the germ plasm of embryos as well as in the soma (Fig. 3, Fig. S1A,B). 12/17 RNAs represented novel germline components, while xpat, sybu, otx1, tob2 and efnb1 were confirmed as previously described (Cuykendall and Houston, 2010; De Domenico et al., 2015).
Except for sox7 and efnb1, germ plasm expression persisted through neurula (Fig. 3, Fig. S1A,B). During gastrulation, the germline segregates from endoderm and PGCs form a distinct lineage. By neurula, PGCs are transcriptionally active for the first time (Venkatarama et al., 2010). sox7 is expressed through gastrula stages but is lost by neurula suggesting an early role in PGC development. efnb1 is not expressed at gastrula but is re-expressed by neurula, probably as part of a new gene expression program in PGCs. PGC migration towards the dorsal mesentery and organogenesis occur at the early tailbud stage. Only 41% (7/17) of the vegetal transcripts, including xpat, remained expressed in PGCs during tailbud stage (Fig. 3). These transcripts included transcription factors (otx1, e2f1), RISC factor (mov10), actin regulator (spire1), Ser/Thr kinase (wnk2), and an adaptor protein that binds kinesin (sybu). They are likely to be zygotic transcripts and might play roles in migration and/or in preserving PGC totipotency.
In addition to PGC expression, 71% (12/17) are also expressed in the eye anlage and the future posterior region in neurula, including the somitogenic mesoderm (Fig. 3, Fig. S1A,B). In tailbud stages the most notable expression pattern was in neural regions including the eye, cranial ganglia, neural tube, nasal placodes, brain, otic vesicle and the intersegmental region between the somites. Vegetally enriched transcripts that have previously been shown to be involved in neural pathways included efnb1, sybu, wnk2 and otx1 (Colozza and De Robertis, 2014; Bovolenta et al., 2006; Rinehart et al., 2011; Zhang et al., 2015). Taken together, these data suggest that genes enriched at the vegetal pole of Xenopus oocytes contribute to both germline and neural specification during development.
Novel mRNAs enriched at the animal pole
We identified 15 mRNAs, six not previously reported, that were at least ten-fold enriched at the animal pole (Table 4). These mRNAs represent the following functional categories: signaling [dand5, ifrd2, slc18a2, spata13, acaca, tmem192, ssr1, prr11], gene expression (pou2f1), cell division (rmdn3) and metabolism (adpgk, prrg4, prkag1). Two mRNAs previously shown to be enriched at the animal pole were chosen for validation by WISH: slc18a2 and dand5 (Fig. S1C). dand5 is a TGFβ and Wnt antagonist (Eimon and Harland, 2001; Bates et al., 2013). Slc18a2 transports monoamines into secretory vesicles for eventual exocytosis (Nikishin et al., 2012). As expected, both animal pole transcripts were expressed exclusively in the animal pole of pre-MBT embryos (Fig. S1C). Animal pole transcripts are expressed primarily in the neural ectoderm at later developmental stages, as previously described (Fig. S1C) (Grant et al., 2014). MetaCore (GeneGo) direct interaction pathway analysis did not reveal integrated networks among the animal pole-enriched RNAs.
Overexpression of e2f1, otx1, parn, rras2 and wwtr1 significantly reduces PGC number
As a first step towards functional analysis, six vegetally enriched transcripts expressed in PGCs were selected for overexpression studies. One-cell embryos were injected in the vegetal region with in vitro synthesized mRNA of selected transcripts or GFP. Tailbud embryos were collected and the number of PGCs per embryo was calculated and compared with GFP-injected controls. Overexpression of the transcription factors e2f1 and otx1, the transcriptional co-activator wwtr1, a Ras-like small GTPase rras2, and the poly(A)-specific ribonuclease parn significantly reduced PGC number, whereas spire1 had no effect (Fig. 4). Importantly, aside from the effects on PGC number, embryos appeared normal. Taken together, these results suggest a specific role in PGC development for otx1, e2f1, wwtr1, rras2 and parn.
Embryos depleted of otx1 and wwtr1 have increased PGC number
To further test the function of otx1 and wwtr1, we created loss-of-function morphants by injection of antisense morpholinos (MOs) into one-cell embryos. Both otx1-MO and wwtr1-MO blocked the translation of their respective proteins in a dose-dependent fashion (Fig. S2B). Injected embryos were collected at tailbud stages and the number of PGCs per embryo was calculated and compared with the control. Inhibition of otx1 and wwtr1 significantly increased PGC number (Fig. 5).
Misexpression of sox7 reduces PGC number
The transcription factor sox7 has been shown to play various roles in embryonic development, including proliferation, differentiation, hematopoiesis, cardiogenesis and vasculogenesis (Stovall et al., 2014). However, its role in PGCs is unknown. sox7 expression is significantly upregulated in the vegetal compared with the animal pole in stage VI oocytes (Fig. 1C, Table 1) and its expression persisted in PGCs during gastrulation (Fig. 3B). To assess the role of sox7 in PGC development, fertilized embryos were injected vegetally with either a sox7-targeted MO (Fig. S3) or the dominant-negative transcript sox7dCEnR (Fig. 6). Overexpression and rescue experiments were performed using X. tropicalis sox7 (Xtsox7) mRNA (Zhang et al., 2005a). Initial injections of sox7dCEnR or Xtsox7 mRNAs were performed at various concentrations to determine an effective dose that would not cause the phenotypic alterations observed by Zhang et al. (2005a) (data not shown). No notable changes in morphology were observed in embryos injected with sox7-MO, sox7dCEnR (200 pg) and/or Xtsox7 (200 pg) mRNA (Fig. 6A-D, Fig. S3C,D). The number of PGCs per embryo was calculated and compared with uninjected controls. Dominant-negative and MO-mediated inhibition of sox7, as well as overexpression of Xtsox7, each significantly reduced the number of PGCs in tailbud embryos (Fig. 6, Fig. S3). Expression of both sox7dCEnR and Xtsox7 mRNAs together significantly rescued the effect that sox7dCEnR had on PGC number, presumably because Xtsox7 restored function (Fig. 6D,E). These data suggest that the level of sox7 expression must be tightly regulated for proper PGC development.
efnb1 plays an essential role in PGC specification and migration
Eph receptor tyrosine kinases and their ligands, ephrins, have been shown to be involved in the formation of tissue boundaries, including separation of the germ layers, by regulating migration, adhesion and repulsion during embryonic development (Rohani et al., 2014). However, their specific role(s) in germline development is unknown. Interestingly, efnb1 expression is upregulated in vegetal versus animal poles of stage VI oocytes. efnb1 is also expressed in PGCs at neurula (Fig. 3A), suggesting a role in this lineage. To assess the role of efnb1 in PGC development, efnb1 was overexpressed by injecting flag-tagged efnb1 (efnb1-FL) mRNA into the vegetal region of fertilized embryos. No notable changes in morphology or PGC location were observed (Fig. 7A). However, overexpression of efnb1 significantly reduced the number of PGCs compared with control (Fig. 7A). This effect was rescued by co-injection with efnb1-MO (Fig. 7A). These data suggest that PGC number is not maintained within an environment of excess Efnb1 protein.
We next assessed the effect of efnb1 inhibition using an efnb1-MO as described (Moore et al., 2004). No notable changes in morphology or PGC number were observed in embryos injected with efnb1-MO compared with scrambled-MO or uninjected controls (Fig. 7B). efnb1 inhibition significantly increased the total number of embryos containing mislocalized PGCs according to the boundaries designated for normal PGC localization, between somites 5 and 11, as described by Tarbashevich et al. (2011). Mislocalization was along the anterior/posterior (A/P) axis, primarily beyond the normal posterior boundary (Fig. 7B). PGC mislocalization was rescued by co-expression with an efnb1 mRNA construct containing conservative mutations in the MO-binding region (efnb1-FL-rescue), rendering the MO ineffective (data not shown), confirming the specificity of the effect of efnb1-MO (Fig. 7B). These data suggest that efnb1 is essential for the proper migration of PGCs.
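The scoring of PGC mislocalization relative to the somite 5 to somite 11 boundaries can be sketched as follows. This Python example is illustrative only: the function names, the representation of each PGC by the somite level at which it lies, the inclusive treatment of both boundaries and the rule that a single out-of-bounds PGC marks an embryo as mislocalized are all assumptions, and the example positions are made up.

```python
# Boundaries for normal PGC localization quoted in the text.
ANTERIOR_SOMITE = 5
POSTERIOR_SOMITE = 11


def pgc_in_normal_region(somite_level: float) -> bool:
    """True if a PGC lies between somites 5 and 11 (inclusive, by assumption)."""
    return ANTERIOR_SOMITE <= somite_level <= POSTERIOR_SOMITE


def embryo_has_mislocalized_pgcs(pgc_somite_levels) -> bool:
    """True if at least one PGC in the embryo lies outside the normal region."""
    return any(not pgc_in_normal_region(s) for s in pgc_somite_levels)


def fraction_mislocalized(embryos) -> float:
    """Fraction of embryos containing at least one mislocalized PGC."""
    if not embryos:
        return 0.0
    n_bad = sum(embryo_has_mislocalized_pgcs(e) for e in embryos)
    return n_bad / len(embryos)


# Made-up PGC positions (somite level) for four hypothetical embryos.
example_embryos = [
    [6, 7, 8],   # all within bounds
    [6, 12],     # one PGC posterior to somite 11
    [5, 11],     # on the boundaries, counted as normal here
    [4, 9],      # one PGC anterior to somite 5
]
example_fraction = fraction_mislocalized(example_embryos)
```

Scoring at the level of the embryo rather than the individual PGC mirrors the way the result is reported in the text, as the number of embryos containing mislocalized PGCs.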
p300 is required for normal PGC development
The transcriptional co-activator and histone acetyltransferase p300 has been shown to be expressed during oogenesis (Kwok et al., 2006) and to regulate the metabolic state of mammalian germ cells (Boussouar et al., 2014). p300 is enriched at the vegetal pole of stage VI oocytes (Fig. 2D) and represents one of the network hubs at the vegetal pole (Fig. 2C). We therefore examined whether p300 plays a role in Xenopus PGC development. Fertilized embryos were incubated with DMSO (control) or the p300 small molecule inhibitor C646 until they reached tailbud stages. Treated embryos were then collected, and the number of PGCs per embryo was calculated and compared with controls. Inhibition of p300 significantly reduced PGC number in a dose-dependent manner, suggesting a role in PGC development (Fig. 8).
Here we report the first interrogation of RNAs within the vegetal and animal poles by RNA-seq. WISH revealed all 17 selected mRNAs to be localized, providing strong support for the accuracy of the data sets. Our findings underscore the dramatic transcript asymmetry along the A/V axis and the importance of the vegetal pole in initiating somatic and germline lineages in the early embryo. Importantly, as the annotation of the Xenopus genome improves, our data set can be continually mined to identify splice variants and currently unknown transcripts.
Six important observations have emerged from our studies. (1) We identified 90 novel mRNAs that were over 4-fold enriched at the vegetal pole and six that were over 10-fold enriched at the animal pole. (2) GeneGo analysis revealed a network encompassing over 20% of the annotated vegetally enriched mRNAs, indicating great connectivity of gene function and localization. Transcription factors/co-factors e2f1, irf8, err1 and p300 defined four regulatory hubs for future analysis (Fig. 2C). (3) Unlike mRNAs, localization of maternal miRNAs does not appear to be a strategy employed to regulate gene expression along the A/V axis. (4) Enzymes represented 30% of the 198 enriched annotated mRNAs, underscoring the vegetal pole as a major platform for cell signaling. (5) Well over 10% of the vegetal mRNAs encode components with known functions in neurogenic pathways, including the transmembrane ligand Efnb1 and scaffold protein Grip2. We show that efnb1 is required for proper PGC migration. (6) sox7, otx1, e2f1, wwtr1, rras2 and parn are also germ plasm components and are required for normal PGC development.
Biological process and network analysis
MetaCore (GeneGo) analysis placed over 20% (47/198) of the known vegetal pole mRNAs within a direct interaction network that identified four major hubs centered around transcription factors e2f1, irf8, err1 and the histone acetyltransferase p300 (Fig. 2C). Although the network is built based on data from different systems, it reveals novel regulatory pathways and candidates that can be tested for their functions in embryogenesis. For example, it remains unknown what restricts microtubule array formation to the vegetal pole during cortical rotation. Our RNA-seq analysis identified JNK (mapk8) and slain1 as enriched at the vegetal pole and both have been implicated in microtubule dynamics (GeneCards).
e2f1 constitutes the largest hub by far, connecting 19 other localized mRNAs, including two additional hubs, err1 and p300, and genes involved in cell cycle regulation, DNA replication, pluripotency/differentiation, and metabolism (Fig. 2C). Overexpression of e2f1 resulted in a loss of PGCs (Fig. 4). This effect might be due to misregulation of the cell cycle and/or to initiation of somatic differentiation. Consistent with this hypothesis, Zaragoza et al. (2010) showed that the balance of E2f transcription factors, E2f dimerization partners and C/EBPα is critical for proper cell cycle progression. E2f1 has also been shown to mediate proliferation through Wnt signaling by direct interaction with the Fzd1 promoter (Yu et al., 2013). Furthermore, E2f1 activates Pcsk6, which allows for mesoderm induction by activation of Vg1 (Heasman, 2006). E2f1 also activates Cbx7, a member of the Polycomb repressor PRC1-like complex that plays a pivotal role in the transition from pluripotency to differentiation by regulating Cbx8 and Fzd1 (Klauke et al., 2013; Creppe et al., 2014; Mani et al., 2008; O'Loghlen et al., 2015). Thus, E2f1 in excess may tip the balance towards somatic differentiation in PGCs, causing their loss. Because of the dominant position that e2f1 holds, we attempted to knock down its activity by antisense MO injection into fertilized eggs (data not shown). Unfortunately, we could not detect a phenotype, most likely because of a pre-existing maternal supply of E2f1 protein (Peshkin et al., 2015). Future oocyte host transfer studies will investigate the function of E2f1 by depleting the maternal supply.
Interestingly, overexpression of rras2, parn, otx1 and wwtr1 also resulted in PGC loss, while MO-mediated inhibition of otx1 and wwtr1 caused an increase in PGC number. Previous studies involving these transcripts all mention their possible roles in cell cycle regulation, proliferation and/or survival, consistent with our observations. The deadenylase Parn mediates progression through G0/G1 by regulating p53 and p21 expression (Zhang and Yan, 2015). Rras2 has been implicated in cell proliferation by regulating the PI3K (Murphy et al., 2002) and ERK pathways (Larive et al., 2012). Similarly, Otx1 regulates proliferation through the ERK/MAPK pathway, and is necessary for progression through S phase (Li et al., 2016). Wwtr1(Taz) is involved in cell cycle progression, proliferation and survival by regulating cyclin A and Ctgf expression and Casp3 activity (Wang et al., 2014). Taken together, our working hypothesis is that overexpression results in cell cycle checkpoint abnormalities and cell death, whereas loss of function releases the restrained PGC cell cycle clock resulting in more PGCs. Further investigation of these known downstream targets of rras2, parn, otx1 and wwtr1 in PGCs is necessary to deduce the exact mechanisms by which these genes regulate PGC number.
Sox7 is required for PGC development
Here we show a novel role for sox7 in PGC development. sox7 is expressed in germ plasm, and its expression persists in PGCs after segregation from the endoderm lineage (Fig. 3B). Both knockdown and overexpression of sox7 in the fertilized embryo caused a significant decrease in the number of PGCs at the tailbud stage (Fig. 6, Fig. S3). These results strongly suggest that the level and temporal regulation of sox7 activity in PGCs are crucial to their normal development, most likely by activating the proper gene networks in the germline.
sox7 directly activates genes necessary for endoderm differentiation, including the nodal-related protein-encoding genes, and induces the expression of endodermin and mixer (Zhang et al., 2005a). Thus, overexpression of sox7 in PGCs may cause ectopic expression of these endodermal differentiation genes, resulting in apoptosis of PGCs and ultimately a reduction in their number. Additionally, the Wnt/β-catenin signaling pathway has been reported to promote stem cell self-renewal and cell survival (reviewed by Mohammed et al., 2016), two characteristics necessary to preserve the germline. Xlsox7 harbors the β-catenin-binding motif DRNEFDQYL (Guo et al., 2008), suggesting that sox7 inhibition might cause reduced PGC number by influencing gene expression downstream of Wnt signaling.
Irie et al. (2015) have recently identified the F-type Sox family member SOX17 as the primary regulator of human PGC-like fate. Interestingly, similar to the Xenopus F-type family member sox7 (Zhang et al., 2005b), human SOX17 has historically been reported as crucial for endoderm specification. F-type Sox genes regulate the expression of germline-specific genes, such as those of the Nanos family or DND1, and pluripotent genes such as OCT4 (POU5F1) and NANOG (Irie et al., 2015). These recent findings, along with our observations, have led us to hypothesize that Sox7 is a key transcription factor necessary to specify PGCs in Xenopus. Future studies are necessary to establish when sox7 is translated and whether it partners with another transcription factor(s) in PGCs to activate germline-specific gene expression programs.
Efnb1 is required for normal PGC development and migration
Our functional studies reveal a novel role for efnb1 in PGC maintenance and migration (Fig. 7). efnb1 is expressed in BBs, suggesting that it is an early component of germ plasm. After MBT, PGC-specific efnb1 expression is lost but it is re-expressed at neurula (Fig. 3A). These observations suggested that efnb1 might function in both the endoderm and germline lineages. MO-mediated knockdown of efnb1 did not affect PGC number, but caused PGCs to migrate outside their normal boundaries, primarily into posterior endoderm. By contrast, efnb1 overexpression decreased the number of PGCs but did not affect migration or overall embryo morphology. These results suggest that efnb1 is also involved in signaling pathways necessary for normal PGC development. Both the mismigration and loss of PGCs could be rescued, indicating specificity of the observed phenotypes (Fig. 7).
Ephrin ligands and Eph receptors are known to contribute to the maintenance of vertebrate tissue boundaries (Rohani et al., 2014), to regulate axon migration (Klein and Kania, 2014) and to establish A/P gradients required for proper cell migration (Bush and Soriano, 2010). Therefore, efnb1 knockdown-mediated PGC mismigration may be due to disruption of the A/P axis. Consistent with this interpretation, PGCs were not ectopically found outside of the endoderm nor was PGC number affected. Interestingly, Efnb1 associates directly with Dishevelled and is capable of recruiting it and Grip2 to the plasma membrane (Brückner et al., 1999; Moore et al., 2004; Lee et al., 2006). Grip2 encodes a scaffolding protein known to interact with receptors including Frizzled1 (Korkut et al., 2009; Ataman et al., 2006). Knockdown of Grip2 in the embryo disturbs normal PGC migration (Kirilenko et al., 2008; Tarbashevich et al., 2007), similar to our results with efnb1 (Fig. 7). Taken together, these observations suggest a close physical association between Grip2, Efnb1 and the Wnt signaling components that facilitates correct PGC migration.
Unlike somatic cells, PGCs are known to divide symmetrically only two or three times before exiting the endoderm (Whitington and Dixon, 1975). Disruption of the cell cycle or inappropriate gene expression in PGCs would trigger their cell death (Lai et al., 2012). efnb1 interacts with at least two signaling pathways: FGF (Moore et al., 2004; Lee et al., 2006) and Wnt (Lien and Fuchs, 2014; Lee et al., 2006). Disruption of these pathways could affect cell proliferation or differentiation, which might explain the effect on PGC number when efnb1 is overexpressed.
Inhibition of p300 results in a loss of PGCs
Here we show for the first time that p300 is necessary for proper PGC development (Fig. 8). Pharmacological inhibition of p300 caused a significant reduction in PGC number, suggesting a role in proliferation, apoptosis and/or cell cycle regulation. Similar to what has been shown in retinal cells, p300 may promote PGC proliferation and protect PGCs from apoptosis by modulating the activity of Stat1 and Stat3 (Kawase et al., 2016). Alternatively, p300 might be necessary to allow PGCs to pass through G1 of the cell cycle in order to proliferate, consistent with its role in leukemia cells (Gao et al., 2013). Further investigation is necessary to determine the precise molecular mechanisms by which p300 regulates PGC number.
Our RNA-seq analysis has revealed interconnected pathways highlighting the vegetal pole as a major signaling center. Interestingly, grip2, which encodes an important scaffolding protein in the nervous system, was the most abundant vegetal mRNA identified in our analysis. Grip2 is likely to represent a key scaffolding protein for the assembly, through its PDZ domains, of multi-protein signaling complexes. The challenge now is to functionally test the pathways revealed by our comprehensive list of localized mRNAs and to define the maternal contributions to both germline and somatic cell fates.
MATERIALS AND METHODS
Isolation of animal and vegetal pole samples
X. laevis adult frogs were purchased from Xenopus Express. Ovarian tissue was surgically removed from anesthetized females, then oocytes were enzymatically released from ovarian tissue (Sive et al., 2000) and stage VI oocytes were selected. Oocyte-matched vegetal and animal poles (10-20% of total oocyte for each pole) were cut with a razor blade (Cuykendall and Houston, 2010) (for further details, see the supplementary Materials and Methods). Prior to collection, the germinal vesicle (GV) was manually removed from animal pole samples to ensure that GV retention of transcripts with different final locations would not contribute to false positives in our RNA-seq analysis. RNA was extracted (see the supplementary Materials and Methods) and 25 vegetal and animal oocyte-paired poles were collected per frog. Equal concentrations of RNA from the respective poles of oocytes from three different frogs were combined to make one vegetal and one animal pole paired sample. Oocytes from a total of nine frogs were used to form three vegetal and three animal pole samples that were submitted for RNA-seq analysis. All animal protocols were approved by the Institutional Animal Care and Use Committee of the University of Miami.
RNA preparation for Illumina sequencing
For each sample, 1 µg of total RNA was processed for RNA quality with the Agilent Bioanalyzer 2100. Samples were processed for both RNA-seq and small (sm)RNA-seq. RNA-seq samples were depleted of mitochondrial and ribosomal RNAs with the ScriptSeq Complete Gold Kit (Illumina) and subjected to ten cycles of PCR prior to RNA sequencing on the Illumina HiSeq 2000 using the reagents provided in the Illumina TruSeq PE Cluster Kit v3 and the TruSeq SBS Kit-HS (200 cycle) kits. Reads aligning to a ribosome-specific reference or mitochondrial sequences represented <5% and 1.28% of the total, respectively. For smRNA-seq, samples underwent 11 cycles of PCR and were then prepped using the Illumina TruSeq Small RNA Sample Preparation Guide (15004197 Rev. D). Cluster generation was performed on the Illumina cBot according to the manufacturer's recommendations. An average of 10.1 million (animal) and 19.4 million (vegetal) pass-filter paired-end 100 base reads were generated per sample (range: 9.0-11.2 million, animal; 17.8-22.7 million, vegetal).
Data processing and quantification
Quality control metrics were determined using FastQC software (Babraham Bioinformatics). The total library size was 2.8 Gb, with 95% of the total base pairs above internal FastQC thresholds. Variance between samples was minimal, with sample reads ranging from 24.6 to 26.8 million and from 21.2 to 21.5 million read pairs for animal and vegetal pole samples, respectively. Raw reads were aligned using TopHat v2 RNA-seq analysis software. To assess the quality of the reference genome, reads were aligned to both X. laevis (v6.0 and v7.1) and X. tropicalis (v7.1) references. Alignment rates ranged from 66.42% to 71.45% for the X. laevis genome and from 25.21% to 39.27% for the X. tropicalis genome. Therefore, reads from the X. laevis genome alignment were quantified.
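The choice of reference described above can be illustrated with a minimal sketch. It uses only the alignment-rate ranges reported in the text; the selection rule (prefer the reference whose lowest per-sample alignment rate is highest) and the function names are assumptions made for illustration, not the procedure actually used.

```python
# Alignment-rate ranges (percent of reads aligned), as reported in the text.
ALIGNMENT_RANGES = {
    "X. laevis": (66.42, 71.45),
    "X. tropicalis": (25.21, 39.27),
}


def choose_reference(ranges):
    """Return the reference whose lowest alignment rate is highest.

    Hypothetical decision rule, used here for illustration only.
    """
    return max(ranges, key=lambda name: ranges[name][0])


def ranges_overlap(a, b):
    """True if two (low, high) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


chosen = choose_reference(ALIGNMENT_RANGES)
print(chosen)  # X. laevis
```

Because the two reported ranges do not overlap, any reasonable rule based on alignment rate would select the same reference here.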
Transcripts were quantified using Cufflinks v2.1. At least 21,555 transcripts were detected in each sample, and of those 13,930 had a fragments per kilobase of transcript per million mapped reads (FPKM) value ≥5. Differential expression analysis was performed using Cuffdiff to compare vegetal versus animal pole transcripts with an FPKM value ≥5 (Mortazavi et al., 2008). In total, 5717 transcripts (FDR<0.05, fold change ≥1.95) were differentially expressed. More stringent criteria were then employed to determine transcripts enriched in the vegetal or animal pole. The bulk of large yolk platelets are within the vegetal hemisphere, reducing the yolk-free cytoplasm content there (Danilchik and Gerhart, 1987; Callen et al., 1980; Rebagliati et al., 1985); therefore, we set a criterion of 4-fold enrichment for vegetal versus animal pole transcripts. An inherent bias exists towards the animal pole based on the 4-fold difference in RNA concentrations along this axis; thus, transcripts enriched 10-fold in the animal pole compared with the vegetal pole were considered localized to the animal pole.
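The enrichment criteria above can be expressed as a short filter. The following is a minimal sketch, not the analysis code used in this study: the record layout, the transcript names and the FPKM values are hypothetical, and two details are assumptions, namely that the FPKM ≥5 cut-off applies to the higher of the two pole values and that the fold thresholds are inclusive.

```python
from dataclasses import dataclass

MIN_FPKM = 5.0       # expression cut-off from the text
MAX_FDR = 0.05       # significance cut-off from the text
VEGETAL_FOLD = 4.0   # vegetal/animal ratio required for vegetal enrichment
ANIMAL_FOLD = 10.0   # animal/vegetal ratio required for animal enrichment


@dataclass
class Transcript:
    name: str            # hypothetical identifier
    vegetal_fpkm: float
    animal_fpkm: float
    fdr: float


def classify(t: Transcript) -> str:
    """Label a transcript 'vegetal', 'animal' or 'not enriched'."""
    # Assumption: the FPKM cut-off is applied to the higher pole value.
    if max(t.vegetal_fpkm, t.animal_fpkm) < MIN_FPKM or t.fdr >= MAX_FDR:
        return "not enriched"
    # Comparing products rather than ratios avoids division by zero.
    if t.vegetal_fpkm >= VEGETAL_FOLD * t.animal_fpkm:
        return "vegetal"
    if t.animal_fpkm >= ANIMAL_FOLD * t.vegetal_fpkm:
        return "animal"
    return "not enriched"


# Hypothetical values chosen only to exercise each branch.
examples = [
    Transcript("tx_a", vegetal_fpkm=80.0, animal_fpkm=10.0, fdr=0.001),
    Transcript("tx_b", vegetal_fpkm=2.0, animal_fpkm=30.0, fdr=0.010),
    Transcript("tx_c", vegetal_fpkm=12.0, animal_fpkm=6.0, fdr=0.020),
    Transcript("tx_d", vegetal_fpkm=40.0, animal_fpkm=1.0, fdr=0.200),
]
labels = {t.name: classify(t) for t in examples}
print(labels)
```

In this sketch tx_c passes the FDR and FPKM cut-offs but is only 2-fold higher at the vegetal pole, so it is not counted as localized; tx_d has a large fold change but fails the FDR cut-off.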
Small RNA analysis and novel non-coding RNA identification
For small RNA analysis, reads were aligned to a X. laevis-specific miRNA reference and then counted. X. tropicalis homologs of X. laevis miRNAs were queried for miRNA targets. To identify novel long non-coding (lnc)RNAs, Trinity software (Trinity version 2012-06-08) was used for de novo assembly of the transcripts in each sample. For details, see the supplementary Materials and Methods.
Gene name identification and GO analysis
Joint Genome Institute (JGI) annotation was used for gene name identification (Xenbase.org). To verify gene identity, separate BLAST alignments (NCBI) for the scaffolds were also performed. Genes homologous to those of humans were submitted to GeneCards for summary information (genecards.org). GeneGo software (MetaCore Bioinformatics software from Thomson Reuters, https://portal.genego.com/) was used for pathway analysis. For further details, see the supplementary Materials and Methods.
Whole-mount in situ hybridization (WISH)
mRNAs of select genes were transcribed from respective cDNA clones and 0.5 ng was injected into fertilized embryos. Tailbud-stage embryos were then analyzed for PGC number and location using WISH for the germ plasm marker xpat. Expression of X. laevis otx1, wwtr1, efnb1 and sox7 was inhibited by MO and/or by overexpressing a dominant-negative construct (sox7). X. laevis p300 was pharmacologically inhibited in fertilized embryos by incubation with the small molecule inhibitor C646. Tailbud-stage embryos were then analyzed for proper PGC number and location using WISH. P-values were determined using a two-tailed unpaired Student's t-test. P<0.05 was considered significant. For further details, see the supplementary Materials and Methods.
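For readers unfamiliar with the statistical test named above, the following is a minimal standard-library sketch of a two-tailed unpaired Student's t-test with pooled variance. The PGC counts are hypothetical and the function names are illustrative; the p-value is obtained by numerically integrating the t density, which is adequate for small samples, although a dedicated statistics package would normally be used in practice.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value from Simpson integration of the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * central)


def student_t_test(a, b):
    """Unpaired Student's t-test assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


# Hypothetical PGC counts per embryo, for illustration only.
control = [20, 22, 19, 24, 21]
treated = [14, 12, 15, 11, 13]
t_stat, dof, p_value = student_t_test(control, treated)
print(f"t = {t_stat:.2f}, df = {dof}, significant = {p_value < 0.05}")
```

With these made-up counts the difference between groups is large relative to the pooled variance, so the comparison falls below the P<0.05 threshold stated above.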
We thank Drs Jing Yang for the X. tropicalis plasmids for hook2, tob2 and sox7 and Mike Klymkowsky for the X. laevis sox7 dominant-negative construct. Drs Michael Gilchrist through the Bioinformatics Workshop (Cold Spring Harbor, NY) and Lingyu Wang provided excellent technical support for our RNA-seq analysis.
D.A.O.: Performed experiments, data analysis, figure preparation, helped write manuscript. A.M.B.: Planned experiments, performed functional experiments, data analysis, helped write manuscript. T.H.A.: Functional experiments, figure preparation, data analysis. K.M.N.: Data preparation and analysis. D.V.B.: Bioinformatics and analysis of RNA-seq data. M.L.K.: Conceived and planned experiments, wrote manuscript.
This work was supported by the National Institutes of Health (HD072340, GM102397 to M.L.K.). Deposited in PMC for release after 12 months.
All RNA-seq data are available at Gene Expression Omnibus with accession number GSE80971.
The authors declare no competing or financial interests.