Epigenetic modifications are crucial for the identity and stability of cells, and, when aberrant, can lead to disease. During mouse development, the genome-wide epigenetic states of pre-implantation embryos and primordial germ cells (PGCs) undergo extensive reprogramming. An improved understanding of the epigenetic reprogramming mechanisms that occur in these cells should provide important new information about the regulation of the epigenetic state of a cell and the mechanisms of induced pluripotency. Here, we discuss recent findings about the potential mechanisms of epigenetic reprogramming, particularly genome-wide DNA demethylation, in pre-implantation mouse embryos and PGCs.
Introduction
The variety of cellular states in multicellular organisms reflects the diversity of the transcriptional programme of cells, despite the fact that nearly all cells in any given organism bear an identical genome sequence. The transcriptional state of a cell is governed by a specific set of transcriptional regulators and also by chemical modifications of the genome, including cytosine methylation (5-methylcytosine; 5mC) and post-translational modifications of histone tails, which regulate the accessibility of transcriptional regulators to, and the on or off states or expression levels of, all genes in the genome (Bonasio et al., 2010). The stability of the phenotype of a cell upon mitosis/meiosis is considered to be underpinned by the stability of these modifications. Modifications that regulate the identity of a cell without changing the DNA sequence are referred to as epigenetic modifications (Bird, 2007; Bonasio et al., 2010), and the genome-wide state of the epigenetic status of a cell is known as the epigenome (Bernstein et al., 2007). Once specified during development or in adult physiology, the epigenome of a cell is stable upon mitosis/meiosis, and cell identities are maintained essentially for a lifetime (Bonasio et al., 2010).
During mammalian development, however, there are two crucial developmental stages and/or cell types in which the epigenome undergoes profound reprogramming: pre-implantation embryos and primordial germ cells (PGCs), the precursors both for oocytes and spermatozoa (Fig. 1) (Surani et al., 2007). Epigenetic reprogramming in these cells involves genome-wide demethylation of 5mC; 5mC plays a crucial role in genome imprinting, X inactivation, transposon silencing, the stability of centromeric/telomeric structures and gene expression in general. Over the past decade, the technologies that can be used to determine genome-wide 5mC distribution have dramatically evolved, leading to the identification of global 5mC distribution in a number of cultured cell lines and primary tissues (Lister et al., 2009; Suzuki and Bird, 2008).
In this article, we summarize and discuss recent discoveries about the nature and mechanism of epigenetic reprogramming, particularly those relating to genome-wide DNA demethylation in mouse pre-implantation embryos and PGCs. These findings are important not only for understanding the genetic and epigenetic basis of genome inheritance, but also for elucidating the mechanisms of artificially induced epigenetic reprogramming that may be of medical relevance (Hanna et al., 2010; Hayashi and Surani, 2009; Yamanaka and Blau, 2010). [For other associated epigenetic events, such as histone modification changes and X-chromosome inactivation/reactivation, please see recent reviews (Brockdorff, 2011; Hemberger et al., 2009; Payer and Lee, 2008; Probst and Almouzni, 2011).]
5-methylcytosine: an overview
Cytosine methylation and 5mC distribution
Genome-wide cytosine methylation states, especially those associated with genes, differ among cell types and function as a form of memory of the identity and developmental state of a cell (Lister et al., 2009). The enzymes that methylate cytosine to form 5-methylcytosine (5mC) have been well characterized (see Table 1). DNA methyltransferase (DNMT) 1 preferentially methylates hemi-methylated cytosines in CpG sequences and thus acts as a maintenance methyltransferase to maintain genome-wide methylation patterns during replication (Bestor et al., 1988; Bestor, 1992; Li et al., 1992). DNMT3A and DNMT3B can methylate unmethylated CpG sequences and hence function as de novo methyltransferases (Okano et al., 1998a). DNMT3L has no catalytic activity but recruits DNMT3A and DNMT3B to their targets by recognizing nucleosomes that carry unmethylated histone H3 lysine 4 (H3K4) (Aapola et al., 2000; Bourc’his and Bestor, 2004; Bourc’his et al., 2001; Hata et al., 2002; Ooi et al., 2007).
5mC occurs mostly in CpG sequences and, to a lesser extent, in CpHpG or CpHpH sequences (where H is A, C or T), especially in pluripotent cells (Lister et al., 2009; Ramsahoye et al., 2000; Tomizawa et al., 2011). Mammalian genomes are globally methylated: genes, transposons, repeat sequences and intergenic DNA are all subjected to methylation (Suzuki and Bird, 2008). 5mC can spontaneously deaminate to form thymine (T), creating T:G mismatches, and thus can be a source of point mutations across the genome (see Box 1 and references therein). Unmethylated sequences are most often found in CpG islands (CGIs) (see Glossary, Box 2), which are typically associated with gene promoters (Suzuki and Bird, 2008) and with ∼70% of genes. Based on their CpG ratio, GC content and on the length of the CpG-rich region, promoters are classified as being high-, intermediate- or low-CpG promoters (HCPs, ICPs and LCPs, respectively) (Weber et al., 2007).
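The CpG ratio and GC content used to describe CGIs and promoter classes can be computed directly from sequence. The following Python sketch is illustrative only: the function names are hypothetical, the default thresholds follow the CpG island definition given in the Glossary (Box 2), and the expected CpG count uses the common (C count × G count)/length estimate, which is an assumption rather than a formula stated in this article.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction and observed:expected CpG ratio for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed estimate of the CpG count expected from base composition alone.
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp_cpg": obs_exp}


def looks_like_cgi(seq: str, min_len: int = 200,
                   min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply glossary-style CpG island thresholds (hypothetical helper)."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_fraction"] >= min_gc
            and s["obs_exp_cpg"] > min_obs_exp)


# Toy sequences, not real genomic data.
print(cpg_stats("ACGT" * 50))
print(looks_like_cgi("AT" * 100))
```

An observed:expected ratio well below 1 across bulk genomic sequence reflects the CpG depletion attributed to 5mC deamination, whereas CGIs retain a ratio above the 0.6 threshold.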
5mC and histone modifications
5mC and histone modifications act in concert to form an appropriate epigenome during development and in adult cells (Cedar and Bergman, 2009). Generally, 5mCs are associated with transcriptionally repressive histone modifications, such as histone H3 lysine 9 di- (H3K9me2) or tri-methylation (H3K9me3). This is partly because 5mCs are recognized by methyl-CpG binding proteins, which recruit the histone deacetylase complex (see Glossary, Box 2) (Jones et al., 1998; Nan et al., 1998). The interaction of DNMT1 and G9a, a H3K9 methyltransferase, with the replication complex (see Glossary, Box 2) might also connect 5mC to H3K9me2 (Esteve et al., 2006; Hashimshony et al., 2003). Conversely, methylated H3K9 is bound by heterochromatin protein 1 (HP1), which recruits DNMT1 to confer DNA methylation (Fuks et al., 2003; Smallwood et al., 2007). The interaction of the H3K9 methyltransferases SUV39H1 (suppressor of variegation 3-9 homolog 1) and ESET (also known as SETDB1; SET domain, bifurcated 1) with DNMT3A and DNMT3B can also direct DNA methylation at H3K9me3 (Fuks et al., 2003; Lehnertz et al., 2003; Li et al., 2006). NP95 (also known as UHRF1, ubiquitin-like, containing PHD and RING finger domains 1) is a multi-domain protein that is essential for recruiting DNMT1 to replication foci by interacting with DNMT1 and binding to hemi-methylated DNA (Bostick et al., 2007; Fujimori et al., 1998; Sharif et al., 2007). NP95 also interacts with DNMT3A and DNMT3B (Meilinger et al., 2009), G9a (Kim et al., 2009) and H3K9me3 (Rottach et al., 2010), integrating the DNA methylation and H3K9 methylation pathways.
Unmethylated CpG sequences, most typically CGIs, by contrast, are generally associated with transcriptionally permissive/active acetylated H3K4, H3K4me2 and H3K4me3 (Guenther et al., 2007). CXXC finger protein 1 (CFP1) binds unmethylated CpG sequences via its CXXC zinc-finger domain and recruits the H3K4 methyltransferase SETD1, thereby inducing a H3K4me2/3-positive transcriptionally permissive/active chromatin state (Thomson et al., 2010). Polycomb repressive complex (PRC) 2 (see Glossary, Box 2), which catalyses H3K27me3 formation and induces a repressive chromatin state, also binds unmethylated CGIs and represses their promoter activity during development and in embryonic stem cells (ESCs) (Barski et al., 2007; Bernstein et al., 2006; Mikkelsen et al., 2007; Mohn et al., 2008; Pan et al., 2007; Zhao et al., 2007). In ESCs, PRC target genes are also often marked by H3K4me3, which creates a bivalent modification state that causes the chromatin of a gene to be configured in an active or inactive state depending on the subsequent signal the cell receives. In some cases, PRC-targeted sequences become DNA methylated (Mohn et al., 2008; Weber et al., 2007), possibly through the interaction of EZH2 (enhancer of zeste homologue 2), a component of PRC2, with DNMT3A and DNMT3B (Vire et al., 2006).
Box 1. 5mC: a major source of point mutations
5mC can be spontaneously deaminated to form thymine (T), creating a T:G mismatch (Duncan and Miller, 1980; Poole et al., 2001). Although T:G mismatches can be repaired by the base-excision repair (BER) system, which uses enzymes that have thymine DNA glycosylase activity, such as MBD4 (methyl CpG binding domain protein 4) and TDG (thymine DNA glycosylase) (Poole et al., 2001), they are often unrecognized and lead to point mutations after DNA replication. Thus, 5mC is a source of point mutations across the genome (Kondrashov, 2003). Indeed, owing to the high mutability of the 5mCpG sequence, the frequency of the CpG sequence is much lower (∼0.2-0.25×) than would be expected given the GC content of the genome both in mice and humans (Lander et al., 2001; Rollins et al., 2006; Saxonov et al., 2006; Waterston et al., 2002).
DNA demethylation in mouse pre-implantation development
Mouse development commences with fertilization – the fusion of an ovulated haploid oocyte with a haploid spermatozoon. Up to blastocyst formation, parental genomes undergo extensive epigenetic reprogramming, most notably genome-wide DNA demethylation. However, some genomic regions escape demethylation at this stage, including centromeric repeats, intracisternal A particle (IAP) retrotransposons (∼1000 elements/mouse genome) and the differentially methylated regions (DMRs) (see Glossary, Box 2) that are present in parentally methylated imprinted genes, as well as in some non-imprinted genes (Borgel et al., 2010; Lane et al., 2003; Reik et al., 2001; Rougier et al., 1998).
DNA demethylation in mouse zygotes
A recent genome-wide bisulfite sequence analysis (see Box 3 for more about bisulfite sequence analysis and other techniques for assaying DNA methylation), which covered ∼1% of the mouse genome, reported that ∼80% of CpG sequences are methylated in sperm (Popp et al., 2010). The maternal genome in mouse oocytes has lower levels of genome-wide DNA methylation than does the paternal genome in sperm, although the precise extent of genome-wide DNA methylation in the maternal genome has yet to be determined (Howlett and Reik, 1991; Monk et al., 1987; Smallwood et al., 2011). Within 1 hour of fertilization, the paternal genome releases protamine and is re-packaged by maternal nucleosomal histones, forming the paternal pronucleus. Either subsequently or concomitantly, the paternal pronucleus enlarges substantially by incorporating further maternal proteins, such as stella (also known as PGC7 and DPPA3, developmental pluripotency associated 3) and nucleoplasmin 2 (NPM2) (Li et al., 2010).
The development of the zygote is defined by the pronuclear stages P0/1 to P5 (Adenot et al., 1997; Santos et al., 2002). P0, P1 and P2 embryos are in the G1 phase, P3 and P4 embryos are largely in the S phase, replicating both the paternal and maternal genomes, and P5 embryos are mostly in the post-replicative G2 phase (Adenot et al., 1997). Several reports have shown that the paternal genome undergoes genome-wide DNA demethylation before replicating its DNA (before or in early P3) via an active mechanism (Mayer et al., 2000; Oswald et al., 2000; Santos et al., 2002; Wossidlo et al., 2010). By around P3 (7-8 hours post-fertilization), the paternal genome appears to lose a substantial amount of 5mC immunofluorescence (see Box 3 for more about immunofluorescence analysis), whereas the maternal genome retains it at a seemingly constant level. Treatment of zygotes with aphidicolin, which blocks DNA replication, has no effect on the loss of 5mC immunofluorescence from the paternal genome, indicating that a replication-independent, active DNA demethylation mechanism occurs in early zygotes (Mayer et al., 2000; Oswald et al., 2000; Santos et al., 2002; Wossidlo et al., 2010).
In mouse zygotes that lack stella, a maternal-effect protein that is essential for pre-implantation development (Payer et al., 2003), a substantial decrease of 5mC immunofluorescence is also observed in the maternal genome, indicating that the maternal genome is normally protected from demethylation by stella (Nakamura et al., 2007). Bisulfite sequence analysis has also shown that in stella-deficient zygotes, some of the paternally [H19, Rasgrf1 (RAS protein-specific guanine nucleotide-releasing factor 1)] and maternally [Peg (paternally expressed gene) 1, Peg3, Peg10] methylated imprinted genes, as well as IAPs, are demethylated at the P5 stage: these genes remain methylated at this stage in wild-type zygotes. Therefore, stella also protects some imprinted genes from demethylation. The reason(s) that stella can protect only paternally imprinted genes and the maternal genome remains unknown, particularly because stella localizes to both the paternal and maternal pronucleus (Nakamura et al., 2007; Payer et al., 2003). Underlying differences in chromatin modifications between the paternal and maternal genome may account, at least in part, for this asymmetric action of stella (Nakamura et al., 2007).
Although the immunofluorescence-based observations described above appear to be significant, bisulfite sequencing analyses offer less substantial evidence for DNA demethylation in mouse zygotes before they undergo DNA replication (Lane et al., 2003; Wossidlo et al., 2010). For example, the level of 5mC in LINE1 elements (∼6×10⁵ elements/mouse genome; see Glossary, Box 2) in zygotes at P1 is ∼68%; it drops to ∼53% at early P3 and then to ∼27% after DNA replication (Wossidlo et al., 2010). The CpG methylation level of early retrotransposons (ETn) (∼300-400 elements/mouse genome) at P1 is ∼77% and drops to ∼61% at early P3; after DNA replication, methylation levels then rise, for reasons that are unclear, to ∼73% (Wossidlo et al., 2010). Another study reported the demethylation of LINE1 and ETn elements by the P4 stage: LINE1 methylation drops from 87% to 55% and ETn methylation from 89% to 66% (Okada et al., 2010). The demethylation of imprinted genes and IAPs in stella-deficient zygotes is also observed after DNA replication (Nakamura et al., 2007). Thus, although the extent of DNA demethylation in zygotes after DNA replication is substantial, that occurring before DNA replication, by active DNA demethylation, seems less prominent. The discrepancy between the immunofluorescence and bisulfite sequence data thus needs to be resolved. As we discuss below, the recent discovery that 5mC can be oxidized to form 5-hydroxymethylcytosine (5hmC) might offer such a resolution.
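The P1, early P3 and post-replication figures quoted above from Wossidlo et al. (2010) can be set side by side in a short calculation. The Python sketch below is illustrative only. The dictionary keys and the function name are hypothetical, and the "halved" value rests on a simplifying assumption that is not made in the text: that one round of replication without any maintenance methylation would halve the measured level.

```python
# CpG methylation levels (percent) as quoted in the text (Wossidlo et al., 2010).
reported = {
    "LINE1": {"P1": 68.0, "early_P3": 53.0, "post_replication": 27.0},
    "ETn": {"P1": 77.0, "early_P3": 61.0, "post_replication": 73.0},
}


def summarize(levels: dict) -> dict:
    """Compare the pre-replication drop with a simple halving expectation."""
    pre_drop = levels["P1"] - levels["early_P3"]
    # Assumption: no maintenance methylation during one round of replication.
    halved = levels["early_P3"] / 2
    return {
        "pre_replication_drop": pre_drop,
        "expected_if_halved": halved,
        "observed_post_replication": levels["post_replication"],
    }


for element, levels in reported.items():
    print(element, summarize(levels))
```

Under this assumption, the LINE1 value after replication (∼27%) is close to half of the early P3 value, whereas the ETn value (∼73%) is far above it. Because bisulfite sequencing does not separate 5mC from 5hmC, these numbers should be read as combined 5mC plus 5hmC levels.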
5hmC: an intermediate for DNA demethylation?
5hmC, a stable hydroxylated metabolite of 5mC, was first identified in the genome of T-even bacteriophages (Wyatt and Cohen, 1953) and is produced as an oxidation damage product from 5mC (Burdzy et al., 2002; Zuo et al., 1995). However, the discovery that it is a physiologically relevant DNA modification in mammals, such as in mouse neurons and ESCs (Kriaucionis and Heintz, 2009; Tahiliani et al., 2009), is more recent.
Box 2. Glossary
Base excision repair (BER). A DNA repair pathway that removes mismatched DNA bases, followed by incision of the 5′ phosphodiester bond of the abasic site and gap filling by a DNA polymerase.
CpG island. A genomic region not depleted of CpGs that is typically 200-500 bp in length, has an observed:expected CpG ratio of >0.6 and a minimum GC content of 50-55%.
Differentially methylated region (DMR). A genomic region that is differentially DNA methylated between the two parental chromosomes.
Elongator complex. A protein complex that associates with the RNA polymerase II holoenzyme in transcription elongation and exhibits histone acetyltransferase activity.
Endosperm. A nutritive tissue of flowering plant seeds that is formed by the fertilization of the maternal central cell.
HIRA. A histone chaperone for H3/H4 for nucleosome assembly independent of DNA replication.
Histone deacetylase complex. A transcription repressor complex that involves Sin3 and histone deacetylases.
LINE1 (long interspersed nuclear element 1). A retrotransposon-derived genetic element that encodes reverse transcriptase and integrase. Around 6×10⁵ LINE1 elements are present in the mouse genome, constituting ∼19% of the genome.
NAP1 (nucleosome assembly protein 1). A histone chaperone for H2A/H2B and H1/B4 that removes histones and is implicated in transcription factor binding to DNA.
Nucleotide excision repair (NER). A DNA repair pathway that recognizes DNA lesions that result in conformational distortions, such as a thymine dimer.
Polycomb repressive complex (PRC). Transcriptional repressor complexes. PRC1 contains ubiquitin ligases RING1A and RING1B, which catalyze mono-ubiquitylation of Lys 119 of histone H2A. PRC2 contains EED, SUZ12, RbAp46/48 and EZH1/2, a histone methyltransferase responsible for di-/tri-methylation of H3K27.
Replication complex. A macromolecular structure in which eukaryotic DNA replication occurs.
SIN3A repressor complex. A transcription repressor complex that contains SIN3A and histone deacetylases, together with other proteins.
TET (ten-eleven translocation). Proteins that catalyze the oxidation of 5-methylcytosine and produce 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxylcytosine in a 2-oxoglutarate- and Fe(II)-dependent manner, through conserved catalytic domains (Cys-rich and dioxygenase domains).
Thymine DNA glycosylase. DNA glycosylases that catalyze the base excision of thymine mismatched with guanine (e.g. TDG and MBD4).
5mC glycosylase/lyase. An enzyme that removes 5-methylated cytosine from the backbone sugar of DNA.
The hydroxylation of 5mC to 5hmC is catalyzed by a family of dioxygenases, the TET (ten-eleven translocation) 1/2/3 proteins (see Glossary, Box 2), which have different tissue distributions (Cimmino et al., 2011; Szwagierczak et al., 2010; Tahiliani et al., 2009) (Table 1). 5hmC is abundant in the brain (∼40% and ∼13% as abundant as 5mC in Purkinje and granule cells, respectively), but is present at lower levels in other mouse tissues (Kriaucionis and Heintz, 2009). 5hmC is detected in mouse ESCs (∼7-10% as abundant as 5mC) but is undetectable in human T cells and in mouse dendritic cells (Tahiliani et al., 2009). These findings raise the possibility that demethylation of 5mC to cytosine occurs via the generation of a 5hmC intermediate, which is in turn converted into unmethylated cytosine. Furthermore, recent studies show that TET proteins further convert 5hmC into 5-formylcytosine (5fC) and then into 5-carboxylcytosine (5caC) (He et al., 2011; Ito et al., 2011; Pfaffeneder et al., 2011). The genomic contents of these cytosine derivatives in mouse ESCs are, however, very low, e.g. 20 5fC and three 5caC in every 10⁶ Cs (5hmC is about 1.3 × 10³ in every 10⁶ Cs), and the significance of these modifications needs to be clarified (Ito et al., 2011).
5hmC may also be a biological end-product of demethylation, as methyl-CpG binding proteins have a significantly lower affinity for 5hmC (Valinluck et al., 2004). DNMT1 also recognizes 5hmC very poorly in vitro (Valinluck and Sowers, 2007); thus, 5hmC might passively convert into cytosine during replication. A more recent report, however, shows that NP95, which recruits DNMT1 to replication foci, recognizes 5hmC as efficiently as it does 5mC (Frauer et al., 2011), raising the possibility that 5hmC may have the same capacity as 5mC for 5mC propagation during replication. The biological significance of 5hmC thus requires further clarification.
As we discuss in more detail below, recent studies of the TET proteins have revealed more about their genome-wide binding sites, their functions in epigenetic reprogramming and about the genome-wide distribution of 5hmC.
5hmC and TET proteins in mouse ESCs
Mouse ESCs highly express Tet1, express Tet2 to a lesser extent and do not express Tet3 (Ito et al., 2010; Koh et al., 2011). Upon ESC differentiation, both Tet1 and Tet2 are downregulated (Ito et al., 2010; Koh et al., 2011). When knocked down by RNAi, Tet1 and Tet2 were found to be involved in regulating the expression of pluripotency transcription factors, such as Nanog, Esrrb (estrogen-related receptor β) and Prdm14 (PR domain containing 14) (Ito et al., 2010; Koh et al., 2011; Ficz et al., 2011; Williams et al., 2011). The depletion of Tet1 and Tet2 skews ESC differentiation towards the extra-embryonic lineages (Ito et al., 2010; Koh et al., 2011; Ficz et al., 2011; Williams et al., 2011). Surprisingly, however, Tet1 knockout ESCs, which show a ∼35% reduction in 5hmC levels, exhibit only subtle changes in gene expression, are pluripotent and support full-term mouse development in the tetraploid complementation assay (Dawlaty et al., 2011). To examine the possibility that Tet1 and Tet2 are functionally redundant and to investigate 5hmC functions further in ESCs, Tet1 and Tet2 double knockout ESCs will need to be generated in the near future.
Chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) for TET1 has shown that most TET1-binding sites are in the transcribed regions of genes, with the highest density around transcription start sites (TSSs) of HCPs and ICPs (Williams et al., 2011; Wu et al., 2011; Xu et al., 2011); Williams et al. (Williams et al., 2011), for example, reported ∼6500 TSSs with TET1-binding sites. The CXXC zinc-finger domain of TET1 is required to recruit TET1 onto CpG-rich sequences (Xu et al., 2011). TET1 binding is positively correlated with H3K4me3 and also with bivalent chromatin modifications. 5hmC immunoprecipitation followed by DNA sequencing (hMeDIP-seq, see Box 3) shows that 5hmC is also enriched within gene bodies and at TSSs of HCPs and ICPs (Ficz et al., 2011; Pastor et al., 2011; Williams et al., 2011; Xu et al., 2011). Williams et al. (Williams et al., 2011), for example, identified ∼2400 TSSs enriched for 5hmC. In DNMT1/DNMT3A/DNMT3B triple-knockout (TKO) cells that lack all 5mC (Tsumura et al., 2006), nearly all 5hmC signals were found to be absent (Ficz et al., 2011; Szwagierczak et al., 2010; Williams et al., 2011). These findings indicate that in ESCs, 5hmC is generated from pre-existing 5mC by the action of TET1, and that a significant fraction of 5mC is converted to 5hmC at the TSSs of HCPs and ICPs, where 5mC becomes depleted. One of the functions of TET1 would therefore be to remove aberrant stochastic DNA methylation from HCPs and ICPs, thereby regulating DNA methylation fidelity in ESCs.
Box 3. Methods for genome-wide quantitation of DNA methylation
Methylation-sensitive/dependent enzyme digestion
Genomic DNA samples digested with methylation-sensitive and -insensitive enzymes (e.g. HpaII and MspI, respectively) are compared by microarrays (Tompa et al., 2002) or by next-generation sequencing (Oda et al., 2009). Genomic DNA can also be digested by McrBC, which digests nearly all methylated CpG islands (Sutherland et al., 1992), and then compared with non-digested DNA (Irizarry et al., 2008; Lippman et al., 2004). These methods depend on enzyme restriction sites.
Methylated/hydroxymethylated DNA immunoprecipitation (MeDIP/hMeDIP)
5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC) in fragmented genomic DNA is enriched by immunoprecipitation using specific antibodies and analyzed using microarray (MeDIP/hMeDIP-chip) (Weber et al., 2005) or next-generation sequencing (MeDIP/hMeDIP-seq) (Down et al., 2008). These methods do not quantify absolute 5mC/5hmC levels. The densities of 5mC and 5hmC influence the efficiencies of immunoprecipitation (Pastor et al., 2011; Weber et al., 2007). The GLIB (glucosylation, periodate oxidation, biotinylation) method enables the efficient pulldown of 5hmC, even when it is at a low density (Pastor et al., 2011). MeDIP-chip has been combined with DNA amplification and applied to ∼10⁴ cells from early mouse embryos (Borgel et al., 2010).
Bisulfite sequencing
Genomic DNA is treated with sodium bisulfite, which converts cytosine, but not 5mC/5hmC, to uracil, and analyzed using next-generation sequencing (Cokus et al., 2008; Lister et al., 2008). This method determines methylation sites at single-base resolution, and, with sufficient read depths, absolutely quantifies methylation levels. However, 5mC and 5hmC are indistinguishable by this method. By reducing genome representation with MspI digestion (Meissner et al., 2005), this method has been applied to ∼10³ cells from oocytes and pre-implantation embryos (Smallwood et al., 2011).
Immunofluorescence analysis
5mC/5hmC are marked by specific antibodies and by fluorophore-conjugated secondary antibodies in situ, followed by fluorescent microscopic analysis. This method cannot distinguish the methylation state of specific sequences, but detects genome-wide methylation levels in single cells. 5mC/5hmC antibodies detect densely methylated sequences efficiently but single CpG methylation less efficiently (Pastor et al., 2011; Suzuki and Bird, 2008). As transposon-related elements occupy ∼40% of the genome and genes only ∼2-3% (Lander et al., 2001; Waterston et al., 2002), most 5mC/5hmC signals should be derived from the methylation of transposon-related elements.
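The enzyme-based comparison described in Box 3 can be illustrated in silico. The Python sketch below is a toy model under stated assumptions: both MspI and HpaII are taken to recognize CCGG and cut after the first C, HpaII is taken to be blocked when the internal cytosine of the site is methylated, and MspI is treated as insensitive to that methylation. The function names and the way methylated positions are passed in are hypothetical.

```python
import re

SITE = "CCGG"  # recognition site shared by MspI and HpaII


def cut_positions(seq: str, methylated=frozenset(), enzyme: str = "MspI") -> list:
    """Return cut coordinates (index just after the first C of each cleaved site).

    `methylated` holds 0-based indices of methylated cytosines (hypothetical input).
    """
    cuts = []
    for match in re.finditer(SITE, seq.upper()):
        internal_c = match.start() + 1  # cytosine of the CpG inside CCGG
        if enzyme == "HpaII" and internal_c in methylated:
            continue  # modelled as blocked by CpG methylation
        cuts.append(match.start() + 1)
    return cuts


def fragments(seq: str, cuts: list) -> list:
    """Split the sequence at the given cut coordinates."""
    bounds = [0] + list(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy sequence with two CCGG sites; the second site is methylated at index 11.
toy = "AACCGGTTTTCCGGAA"
print(fragments(toy, cut_positions(toy, {11}, "MspI")))
print(fragments(toy, cut_positions(toy, {11}, "HpaII")))
```

Sites that are cut by MspI but not by HpaII are inferred to be methylated, which is why such methods can only report on CpGs that lie within restriction sites.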
TET1 contributes to the transcriptional repression of a fraction [∼7-8%, as reported previously (Williams et al., 2011)] of its target genes and to a lesser extent their transcriptional activation [∼3% as reported previously (Williams et al., 2011)]. Most of the transcriptional effects of TET1 are independent of the conversion of 5mC to 5hmC, as TET1 has similar transcriptional activity in DNMT TKO ESCs (Williams et al., 2011). Instead, TET1 contributes to transcriptional repression by forming a complex with the SIN3A repressor complex (see Glossary, Box 2) or indirectly by recruiting PRC2 (Williams et al., 2011; Wu et al., 2011). Whether the catalytic activity of TET1 is required for the functions of TET1 in ESCs thus remains to be explored.
5hmC and DNA demethylation in mouse zygotes
Immunofluorescence analysis has shown that 5hmC levels rise on the paternal genome at around pronuclear stage 3 (P3), concomitant with the reduction of 5mC (Gu et al., 2011; Iqbal et al., 2011; Wossidlo et al., 2011). This elevation occurs independently of DNA replication, and 5hmC persists at least until the two-cell stage (Iqbal et al., 2011; Wossidlo et al., 2011). In stella-deficient zygotes, 5hmC increases and 5mC decreases on both the paternal and maternal genomes (Wossidlo et al., 2011), and when Tet3 is knocked down (Tet3 is highly expressed in oocytes and zygotes), 5mC levels increase whereas 5hmC levels decrease, compared with wild type, on the paternal genome (Wossidlo et al., 2011). Importantly, a maternal knockout of Tet3 leads to a failure in the elevation of 5hmC and in the reduction of 5mC from the paternal genome, impaired promoter demethylation of Oct4 (Pou5f1, POU domain, class 5, transcription factor 1) and Nanog, delay in the activation of a paternally derived Oct4 transgene, and frequent death of the resulting embryos (Gu et al., 2011). These findings suggest that, in normal development, the paternal genome is targeted by TET3, which converts 5mC to 5hmC from around the P3 stage onwards and that the TET3-mediated hydroxylation of 5mC accounts, at least in part, for the active DNA demethylation of the paternal genome.
Crucially, both 5mC and 5hmC are resistant to deamination by bisulfite treatment and are indistinguishable in bisulfite sequence analysis (Hayatsu and Shiragami, 1979). This may explain why the paternal genome seems to retain persistent levels of methylation by bisulfite sequence analysis, despite the fact that it shows highly reduced 5mC immunofluorescence. The development of a technology that can discriminate between cytosine, 5mC and 5hmC by quantitative sequencing analysis is crucial for obtaining more detailed information on DNA demethylation of the paternal genome in zygotes.
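The reason bisulfite sequencing cannot separate 5mC from 5hmC can be made concrete with a toy model. In the Python sketch below, the single-letter codes M (5mC) and H (5hmC) are a hypothetical encoding introduced only for illustration, only one strand is modelled, and uracil is written as T, as it would appear in a sequencing read after amplification.

```python
def bisulfite_read(template: str) -> str:
    """Simulate the read obtained after bisulfite conversion of one strand.

    Unmodified C is converted (read as T); M (5mC) and H (5hmC) resist
    conversion and are both read as C.
    """
    return template.translate(str.maketrans({"C": "T", "M": "C", "H": "C"}))


def reference_sequence(template: str) -> str:
    """Return the underlying sequence with modifications written as plain C."""
    return template.translate(str.maketrans({"M": "C", "H": "C"}))


def call_cpg_methylation(reference: str, read: str) -> dict:
    """Map each CpG position in the reference to True if the read retains a C."""
    return {
        i: read[i] == "C"
        for i in range(len(reference) - 1)
        if reference[i:i + 2] == "CG"
    }


# Toy template: one unmodified CpG, one 5mC CpG, one 5hmC CpG.
template = "ACGTMGAHGCC"
read = bisulfite_read(template)
print(read)
print(call_cpg_methylation(reference_sequence(template), read))
```

In this model the 5mC and 5hmC positions both score as methylated, so a locus in which 5mC has been converted to 5hmC would show no change by this assay.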
As discussed earlier, the mechanism by which 5hmC is converted into cytosine in zygotes remains unclear. In the plant Arabidopsis thaliana, it is already well established that DNA demethylation involves 5mC glycosylases/lyases (see Glossary, Box 2; Box 4) and the base excision repair (BER) pathway (see Glossary, Box 2) (Zhu, 2009). This pathway contributes to the genome-wide DNA demethylation that occurs in the endosperm (see Glossary, Box 2) (Gehring et al., 2009; Hsieh et al., 2009). Although there are as yet no known mammalian homologues of plant 5mC glycosylases/lyases, there is evidence that the BER, but not the nucleotide excision repair (NER) (see Glossary, Box 2), pathway is involved in the DNA demethylation of the mammalian paternal genome (Hajkova et al., 2010; Wossidlo et al., 2010; Ziegler-Birling et al., 2009). Accordingly, γH2A.X, the Serine139 phosphorylated form of the histone H2A variant H2AX, which marks DNA strand breaks, and PARP1 [poly(ADP-ribose) polymerase family, member 1], a sensor of single-strand DNA breaks and a component of the BER pathway, are detected specifically on the paternal genome at early P3 (Hajkova et al., 2010; Wossidlo et al., 2010; Ziegler-Birling et al., 2009). At this stage, XRCC1 (X-ray repair complementing defective repair in Chinese hamster cells 1), a core BER component, is tightly bound only to the paternal genome. In stella-deficient zygotes, XRCC1 binds to both the paternal and maternal genomes, and inhibition of PARP and APE1 (apurinic/apyrimidinic endonuclease 1) activity results in reduced demethylation of the paternal genome (Hajkova et al., 2010). It is possible that the TET3-mediated hydroxylation of 5mC on the paternal genome directly or indirectly triggers the BER pathway. This possibility needs to be verified experimentally.
Box 4. DNA demethylation in plants
Compelling genetic and biochemical evidence exists in Arabidopsis thaliana that active DNA demethylation is carried out by 5-methylcytosine (5mC)-specific glycosylases/lyases of the DEMETER family, which consists of four members: DME; repressor of silencing 1 (ROS1, also known as DML1); DML2; and DML3 (Zhu, 2009). These proteins remove 5mC by a glycosylase reaction and cleave one of the phosphodiester bonds. Subsequently, the remaining sugar and phosphate group is removed by an apurinic/apyrimidinic endonuclease and a phosphodiesterase, a proper nucleotide is then inserted by a DNA repair polymerase, and the nick is sealed by a DNA ligase. DME functions to demethylate transposable elements and imprinted genes globally in the endosperm of plants, thereby allowing their parent-of-origin specific (maternal) expression (Gehring et al., 2009; Hsieh et al., 2009).
Other mechanisms of DNA demethylation in mouse zygotes
It has been reported that components of the elongator complex (see Glossary, Box 2), including ELP1, ELP3 and ELP4, are involved in the pre-replicative DNA demethylation of the paternal genome (Okada et al., 2010). The radical SAM (S-adenosylmethionine) domain but not the HAT (histone acetyltransferase) domain of ELP3 appears to be required for this activity. The mechanism by which the elongator complex is involved in demethylating the paternal genome remains to be explored.
DNA demethylation in pre-implantation embryos
There is evidence that DNA methylation is passively removed both from the paternal and maternal genomes from the first S-phase (one-cell stage) up to the morula/early blastocyst stage (Howlett and Reik, 1991; Kafri et al., 1992; Lane et al., 2003; Monk et al., 1987; Oda et al., 2006; Okano et al., 1999; Rougier et al., 1998). Embryos at the morula/early blastocyst stage are therefore considered to bear substantially lower levels of genome-wide DNA methylation than do zygotes. A study that used reduced representation bisulfite sequencing (RRBS) (see Box 3) has recently shown that CGIs that are methylated in mature oocytes are indeed demethylated in blastocysts, but not to the extent that would be expected if passive demethylation occurs at every cleavage division, indicating that mechanisms of DNA demethylation in pre-implantation embryos need to be further investigated (Smallwood et al., 2011). Given that ∼10 primitive ectoderm (PEct) cells constitute the inner cell mass (ICM) of ∼E4.0-4.5 blastocysts and give rise to all somatic and germ cells, it remains an important challenge to elucidate the epigenome of the primitive ectoderm.
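The expectation for passive demethylation at every cleavage division can be written as a simple dilution model. The Python sketch below is a minimal illustration under assumptions that are not taken from the cited studies: methylation is treated as a single genome-wide average, each replication copies methylation onto the new strand with a fixed maintenance efficiency, and no de novo methylation or active demethylation occurs. The function name and the `maintenance` parameter are hypothetical.

```python
def passive_dilution(m0: float, divisions: int, maintenance: float = 0.0) -> float:
    """Average methylation level after a number of replication rounds.

    m0          -- starting methylation level (e.g. percent of CpGs methylated)
    divisions   -- number of rounds of DNA replication
    maintenance -- assumed fraction of template methylation copied to the new
                   strand in each round (0 = fully passive loss, 1 = perfect
                   maintenance)
    """
    if not 0.0 <= maintenance <= 1.0:
        raise ValueError("maintenance must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    # Old strands keep their methylation; new strands receive maintenance * level.
    return m0 * ((1.0 + maintenance) / 2.0) ** divisions


# Illustrative starting level of 80%, followed over successive divisions.
for n in range(6):
    print(n, passive_dilution(80.0, n), passive_dilution(80.0, n, maintenance=0.8))
```

With no maintenance the level halves at each division, so methylation that persists well above this curve, as reported for oocyte-methylated CGIs in blastocysts, implies that some maintenance or remethylation activity is present.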
DNA demethylation in pre-implantation embryos could partly be due to a reduction in DNMT1, as DNMT1, but not DNMT3A or DNMT3B, immunofluorescence is excluded from the nucleus during pre-implantation development (Branco et al., 2008; Hirasawa et al., 2008). The DNA methylation of imprinted genes, IAPs and centromeric repeats is, however, maintained during this period (Borgel et al., 2010; Lane et al., 2003; Reik et al., 2001; Rougier et al., 1998). Interestingly, the conditional knockout of both maternal and zygotic Dnmt1 leads to a complete erasure of DNA methylation at imprinted genes in the blastocyst, demonstrating that DNMT1, which is present in the nuclei of pre-implantation embryos at a low level that is undetectable by immunofluorescence analysis, is sufficient to maintain the DNA methylation of imprinted genes (Hirasawa et al., 2008). The conditional knockout of both maternal and zygotic Dnmt3a and Dnmt3b leads to a partial demethylation of a paternally imprinted gene, Rasgrf1, at E9.5 (Hirasawa et al., 2008), indicating that imprinting maintenance requires the presence of all three DNMTs. The maintenance of DNA methylation at IAPs in pre-implantation embryos also depends on DNMT1 function (Gaudet et al., 2004).
Recent studies have re-examined the long-held view that the DMRs of imprinted genes are resistant to genome-wide DNA demethylation during pre-implantation development (Kobayashi et al., 2006; Tomizawa et al., 2011). According to these studies, the DMRs of imprinted genes, particularly of paternally imprinted genes, are partly demethylated during pre-implantation development, especially at their peripheral regions, and are subsequently remethylated, exhibiting an unexpectedly dynamic regulation (Tomizawa et al., 2011). The mechanism that targets DNMTs to demethylation-resistant sequences remains to be clarified.
Functional significance of DNA demethylation
What is the functional significance of DNA demethylation in pre-implantation embryos? It is most likely to be in the creation of the pluripotent epigenome of the primitive ectoderm (PEct) (Surani et al., 2007).
Genome-wide promoter methylation in sperm seems to be generally similar to that in ESCs and embryonic germ cells (EGCs), except at certain loci that encode pluripotency factors, such as Nanog and Brd1 (bromodomain containing 1) (Farthing et al., 2008). Therefore, the genome-wide DNA demethylation of the paternal genome in the zygote may occur preferentially at transposons, such as at LINE1 elements, and perhaps in intergenic and intragenic regions (Farthing et al., 2008). Given that most imprints conferred during germ cell development are maternally derived, demethylation of the paternal genome may be a consequence of a need for the maternal cytoplasm to erase paternal imprints (Reik and Walter, 2001).
One report has shown that when round spermatids, the DNA of which is still associated with histones, are injected into oocytes by round spermatid injection (ROSI), paternal genome demethylation is not observed. By contrast, when mature sperm, the DNA of which is associated mainly with protamines, are injected into oocytes by intracytoplasmic sperm injection (ICSI), the paternal genome is demethylated. This indicates that the protamine-histone exchange that occurs in the paternal genome once it is in the oocyte may cause paternal DNA demethylation (Polanski et al., 2008). Notably, both ROSI- and ICSI-derived embryos develop to term at the same rate, indicating that paternal genome demethylation may have no functional significance in development (Polanski et al., 2008). As this study used only immunofluorescence analysis to detect 5mC, it is possible that functionally important DNA demethylation of the ROSI-derived paternal genome escaped detection.
Epigenetic reprogramming in primordial germ cells
The blastocyst at implantation (∼E4.0-E4.5) consists of three cell types, the trophectoderm (TE), PEct and the primitive endoderm (PE) (Rossant and Tam, 2009) (Figs 1, 2). After implantation, the PEct gives rise to the epiblast, the source of all somatic cells and of the PGCs (Figs 1, 2). Genome-wide DNA methylation levels increase in PEct-derived tissues in response to the activities of DNMT3A and DNMT3B (Borgel et al., 2010; Kafri et al., 1992; Oda et al., 2006; Okano et al., 1999). For example, many gene-specific CpG sequences that are demethylated by the blastocyst stage are remethylated by E6.5 (Kafri et al., 1992). A more recent study has shown that during implantation, de novo DNA methylation conferred mainly by DNMT3B is primarily targeted to the CGIs of many germline genes, as well as to lineage-specific genes, to repress their expression (Borgel et al., 2010). Moreover, between the blastocyst and E8.5 stages, methylation levels increase from 15 to 80% at major satellite sequences, from 30 to 80% at LINE1 elements and from 60 to 95% at IAP elements (Oda et al., 2006). Compared with the PEct-derived embryonic tissues, the TE-derived placenta remains hypomethylated, with 30% of major satellites, 40% of LINE1 and 65% of IAPs being methylated at E9.5 (Oda et al., 2006). At E13.5, the genome-wide methylation level of placenta is 43.2% (Popp et al., 2010).
PGC specification takes place in the most proximal epiblast in response to bone morphogenetic protein (BMP) signalling from the extra-embryonic ectoderm at ∼E6.0 (Lawson et al., 1999) (Fig. 2A,B). At this stage, epiblast cells are still pluripotent but are being propelled towards somatic fates and are in the process of losing their pluripotency (Kurimoto et al., 2008). BMP signalling induces the expression of the transcriptional regulators BLIMP1 (also known as PRDM1, PR domain containing 1, with ZNF domain) and PRDM14, in the most proximal epiblasts at ∼E6.25 and E6.5, respectively; the BLIMP1- and PRDM14-positive cells go on to form a cluster of ∼40 alkaline phosphatase (AP)-positive PGCs at the base of the incipient allantois at ∼E7.25 (Ginsburg et al., 1990; Ohinata et al., 2009; Ohinata et al., 2005; Vincent et al., 2005; Yamaji et al., 2008) (see Fig. 2). These established PGCs shut down the somatic transcriptional programme (for example, by turning off Hox gene expression), re-acquire the expression of pluripotency factors (such as Sox2) and prepare for the epigenetic reprogramming that manifests after E7.75. From ∼E7.5, PGCs start to migrate to the hindgut, from where they migrate to the mesentery and finally to the genital ridges, which they colonize by E10.5 to initiate sexually dimorphic development (Kurimoto et al., 2008; Saitou et al., 2002; Seki et al., 2005; Seki et al., 2007) (Fig. 2A,C). In PGCs, BLIMP1 is required for the repression of the somatic programme, and both BLIMP1 and PRDM14 are involved in the re-expression of pluripotency factors and in epigenetic reprogramming (Kurimoto et al., 2008; Yamaji et al., 2008).
Although the precise nature of the epigenome of the pre-gastrulating epiblast and of established PGCs at E7.25 is unknown and requires further investigation, we do know that, in early PGCs, methylation at imprinted loci is maintained (Hajkova et al., 2002; Lee et al., 2002), that one X chromosome in females is inactivated (Sugimoto and Abe, 2007; Tam et al., 1994) and that transposable elements, such as LINE1 and IAP, are relatively highly methylated (both are ∼70% methylated at E11.5) (Hajkova et al., 2002). It is therefore likely that PGCs at their outset bear a genome-wide DNA methylation pattern that is comparable with that of somatic cells at the same stage.
DNA demethylation in PGCs
The most striking epigenetic event in PGCs is the genome-wide DNA demethylation that encompasses genic, intergenic and transposon sequences, which is completed in both sexes by E13.5 (see Fig. 3). As a consequence of this demethylation, one inactivated X-chromosome in females is reactivated, imprinted loci are fully demethylated and methylation at most transposable elements is erased (Hayashi and Surani, 2009). Notably, long terminal repeat (LTR) retrotransposon sequences, including IAPs, are more resistant to demethylation (Hajkova et al., 2002; Lane et al., 2003; Popp et al., 2010) and can cause transgenerational epigenetic inheritance (Whitelaw and Whitelaw, 2008).
A genome-wide bisulfite sequence analysis (covering ∼1% of the genome) has quantified levels of 5mC in PGCs at E13.5, as well as in various cell types, and has shown that both male and female PGCs are extremely hypomethylated relative to other tissues (Popp et al., 2010). For example, whereas median methylation levels at CpGs in sperm are 85%, in ESCs they are 75%, in E13.5 embryos they are 73.2%, and in placenta they are 43.2%, those in E13.5 male and female PGCs are only 16.3% and 7.8%, respectively. Methylation levels in PGCs are substantially lower than the level of 22% that has been recorded in methylation-deficient Np95–/– ESCs. The fact that female PGCs have considerably lower methylation levels than do male PGCs may be because, owing to X-reactivation, female PGCs bear two active X-chromosomes, which may encode a modifier locus to lower genome-wide methylation levels (Zvetkova et al., 2005).
Dynamics of DNA demethylation in PGCs
The genome-wide DNA methylation state of PGCs before E13.5 has not yet been reported. Nonetheless, several studies have examined the timing of the demethylation of imprinted genes, some single-copy genes and transposable elements in PGCs from E10.5 to E12.5/E13.5 (Hajkova et al., 2002; Lane et al., 2003; Lee et al., 2002). In general, the timing of demethylation depends on the genes being analyzed and is thus heterogeneous. This finding might also reflect heterogeneity in the timing of demethylation in each PGC. One study has identified the rapid demethylation of imprinted genes between E11.5 and E12.5, proposing the involvement of active demethylation (Hajkova et al., 2002) when considering the doubling time of PGCs of ∼16 hours (Tam and Snow, 1981); the DMRs of maternally methylated genes Snrpn (small nuclear ribonucleoprotein N), Peg3 and Lit1 [also known as Kcnq1ot1 (potassium voltage-gated channel, subfamily Q, member 1, overlapping transcript 1)] are all nearly fully methylated at E11.5, but become almost fully demethylated at E12.5.
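The reasoning behind the proposal of active demethylation can be illustrated with a short calculation. The sketch below is an illustration under stated assumptions rather than an analysis from the cited studies: it takes the ∼16-hour doubling time quoted above, treats the E11.5 to E12.5 interval as 24 hours, and assumes that purely passive loss halves methylation once per cell doubling with no maintenance at all. The function name is hypothetical.

```python
def min_residual_by_passive_dilution(interval_h: float, doubling_time_h: float) -> float:
    """Lowest methylated fraction reachable by passive dilution alone,
    relative to the starting level, assuming one halving per cell doubling."""
    if interval_h < 0 or doubling_time_h <= 0:
        raise ValueError("interval must be >= 0 and doubling time must be > 0")
    divisions = interval_h / doubling_time_h
    return 0.5 ** divisions


# E11.5 to E12.5 is taken as 24 h; the doubling time of PGCs is ~16 h.
residual = min_residual_by_passive_dilution(24.0, 16.0)
print(f"cell doublings in the interval: {24.0 / 16.0:.1f}")
print(f"residual methylation if loss were purely passive: {residual:.1%}")
```

With these inputs, passive dilution alone would leave roughly 35% of the starting methylation after one day, which is difficult to reconcile with DMRs that go from nearly fully methylated to almost fully demethylated over the same interval.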
Conversely, other studies have demonstrated a gradual erasure of methylation at several imprinted genes and retrotransposons, such as LINE1 and IAP (Lane et al., 2003; Lee et al., 2002). Notably, imprinted genes such as Nnat (neuronatin), H19 and Peg10 are already partly (∼50%) demethylated at E10.5 (Lee et al., 2002). These observations are compatible with the occurrence of replication-dependent passive demethylation. More comprehensive measurements of DNA methylation states during PGC development should provide further insights into the dynamics, and hence the mechanism, of DNA demethylation.
Active DNA demethylation in PGCs?
There is evidence that the cytosine deaminases AID (activation-induced deaminase) and APOBEC1 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1) (Box 5, Fig. 4, Table 1) can convert 5mCs to thymines by deamination, creating T:G mismatches that might then become targets of thymine glycosylases (see Glossary, Box 2), such as MBD4 (methyl CpG binding domain protein 4) or TDG (thymine DNA glycosylase; see Glossary, Box 2), which then trigger the BER pathway (Morgan et al., 2004). AID-deficient male and female PGCs at E13.5 show median methylation levels of ∼22% and 20%, respectively, which are higher than the methylation levels of wild-type male and female PGCs (16.3% and 7.8%, respectively), indicating that AID functions in genome-wide DNA demethylation in PGCs (Popp et al., 2010). Importantly, AID deficiency does not impact genome-wide methylation levels in cells/tissues other than PGCs (Popp et al., 2010). As the methylation levels of AID-deficient PGCs are still lower compared with those of earlier wild-type PGCs, the demethylation events occur even without AID, possibly owing to compensation by other deaminases, including APOBEC1/2/3.
However, it is important to note that both AID-deficient male and female mice are relatively healthy (except for their B-cell-derived phenotype) and fertile, although some abnormalities in litter size and progeny birth weights have been reported (Popp et al., 2010). Considering that the deregulated dose of even a single imprinted gene profoundly affects development and adult physiology, the genome-wide DNA demethylation that occurs in PGCs, which erases imprints and contributes to the creation of proper imprinted gene dose, should be a crucial event. Therefore, the finding that DNA demethylation deficiencies in AID-mutant PGCs do not lead to profound reproductive defects, such as infertility, subfertility or marked adult phenotypes, appears to be counter-intuitive. It has also been shown that AID is unable to act on double-stranded DNA and that 5mC is a much less efficient target for AID-mediated deamination than is unmethylated cytosine in vitro (Bransteitter et al., 2003; Di Noia and Neuberger, 2007; Larijani et al., 2005), raising the issue of whether AID can directly deaminate 5mC in vivo.
Box 5. AID and APOBECs: cytidine deaminases
AID (activation-induced deaminase) and APOBECs (apolipoprotein B mRNA editing enzyme, catalytic polypeptides) are a group of cytidine deaminases in vertebrates that can introduce mutations in DNA and RNA by deaminating cytidine to uridine. AID is involved in class switch recombination (CSR) and in somatic hypermutation (SHM) of the immunoglobulin (Ig) genes (Muramatsu et al., 2000; Revy et al., 2000). In one model, in activated B-cells, AID deaminates cytosines into uracils on the Ig loci, which creates U:G mismatches, triggering the error-prone DNA repair system (Di Noia and Neuberger, 2007). Consequently, U:G mismatches occurring in the V, D and J genes lead to affinity maturation or gene conversion, whereas U:G mismatches occurring in the switch regions lead to CSR (Di Noia and Neuberger, 2007). Another model posits that AID deaminates unidentified mRNA, leading to the production of a potential endonuclease that cleaves DNA during the immune response (Honjo et al., 2005). In the absence of AID, none of these events occurs (Muramatsu et al., 2000; Revy et al., 2000). APOBEC1 is an RNA deaminase that converts cytidine to uridine. Most typically, it edits apolipoprotein B RNA, generating a truncated apolipoprotein B in a tissue-specific manner (Conticello, 2008). APOBEC3 functions to restrict the activity of viruses and retrotransposons in primates by editing their DNAs (Conticello, 2008). Interestingly, both Aid and Apobec1 are located in close proximity to Nanog and stella/Pgc7 (Dppa3, developmental pluripotency-associated 3) on mouse chromosome 6. This may account for the expression of Aid and Apobec1 in pluripotent cell lineages, such as oocytes, embryonic stem cells and primordial germ cells (Morgan et al., 2004).
In support of the active removal of 5mC during DNA demethylation in PGCs, the BER, but not the NER, pathway has been reported to operate in PGCs during their genome-wide DNA demethylation (Hajkova et al., 2010). As in the paternal pronucleus of the zygote, in ∼E11.5 PGCs, components of this pathway, including XRCC1, APE1 and PARP1, are found to be enriched in their nuclei, and XRCC1 is found bound to PGC chromatin, suggesting that ssDNA breaks are present in PGCs. Hajkova et al. have shown that, as a potential consequence of the DNA demethylation mediated by the DNA repair mechanisms, PGCs at ∼E11.5 show dramatic changes in their chromatin states, including rapid loss of linker histone H1, loss of detectable chromocentres, significant enlargement of nuclei, and a concomitant loss of H3K9me3, H3K27me3, H4/H2AR3me2s and H3K9ac (Fig. 3). The loss of H3K9me3 and H3K27me3 seems transient, with these modifications being recovered after E12.5, whereas the loss of H4/H2AR3me2s and H3K9ac seems persistent. These dynamic changes are possibly mediated through histone replacement, perhaps by the histone chaperone HIRA (histone cell cycle regulation defective homolog A) or NAP1 (nucleosome assembly protein 1) (see Glossary, Box 2) (Hajkova et al., 2008).
Although the expression of Aid, Apobec1 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1), Mbd4 and Tdg (thymine DNA glycosylase) is low in PGCs from E10.5 to E12.5, significant expression of Tet1 has been found in these cells (Hajkova et al., 2010), indicating that, in PGCs, TET1 may convert 5mC into 5hmC, which could be cleaved by an as yet unidentified 5hmC glycosylase, leading to the activation of BER. However, recently generated Tet1 knockout mice are viable and fertile, and mating between homozygous mutant males and females produces viable progeny, although with a reduced average litter size (three to six pups compared with wild-type litters of five to nine pups) (Dawlaty et al., 2011). The role of TET proteins in gametogenesis and fertility thus requires further investigation.
Passive DNA demethylation in PGCs?
The extent of genome-wide DNA demethylation in PGCs is extraordinary and more global compared even with that in pre-implantation embryos, because, in PGCs, genome imprints are erased and the demethylation of transposable elements is more extensive (Hajkova et al., 2002; Lane et al., 2003; Lee et al., 2002; Popp et al., 2010). This suggests the presence of mechanisms unique to PGCs, allowing nearly complete DNA demethylation. Although AID and TET1 are suggested to be part of this process, these molecules are expressed elsewhere (e.g. B-cells and ESCs), in which genome-wide DNA demethylation is not reported. To understand more fully the mechanism of this extensive genome-wide DNA demethylation, it is important to investigate the events that are unique to PGCs.
Upon PGC specification, Dnmt3b, Dnmt3a and Np95 are transcriptionally repressed (Kurimoto et al., 2008; Seki et al., 2005), although PGCs continue to express Dnmt1. GLP and G9a, the histone methyltransferases that confer the H3K9me2 mark to chromatin during development and in ESCs (Tachibana et al., 2002; Tachibana et al., 2005), are also repressed in PGCs at around E7.5 and E9.5, respectively (Kurimoto et al., 2008; Seki et al., 2005). PGCs continue to repress these molecules at least until E12.5. Thus, PGCs have little to no DNA methyltransferase and H3K9 di-methylase activity from soon after their specification (∼E7.5) to E12.5.
Immunofluorescence studies indicate that, during the migration period, PGCs show reduced genome-wide DNA methylation, and exhibit decreases in H3K9me2 and increases in H3K27me3 in a progressive, cell-by-cell manner. By E9.5, when PGCs emigrate out into the mesentery, nearly all of them bear low H3K9me2 and high H3K27me3 levels (Seki et al., 2005; Seki et al., 2007). Western blot analysis confirms that PGCs at E12.5 have highly reduced H3K9me2 and significantly elevated H3K27me3 levels compared with E6.5 epiblasts and with somatic cells in the gonads at E12.5 (Seki et al., 2005).
These findings, together with the cell cycle dynamics of migrating PGCs, indicate that their genome-wide DNA demethylation might occur partly through a passive mechanism (Fig. 3). The low H3K9me2 state of PGCs may be of relevance to their DNA demethylation, because in G9a/Glp-knockout ES cells, which show highly reduced H3K9me1/2, some single-copy genes and retrotransposable elements are DNA demethylated even in the presence of the three DNMTs (Dong et al., 2008; Tachibana et al., 2008). The timing of DNA demethylation in PGCs might depend on the target preference of the residual DNMTs. Indeed, as discussed earlier, in pre-implantation embryos, in which genome-wide DNA methylation levels substantially decrease by a presumably passive mechanism, the methylation of DMRs at imprinted genes is maintained by the activity of DNMT1, which is expressed at a very low level (Hirasawa et al., 2008). Moreover, the maintenance of DNA methylation at some sequences, including at retrotransposons, requires cooperation between DNMT1, DNMT3A and DNMT3B (Chen et al., 2003; Liang et al., 2002), indicating that DNA methylation patterns can be altered by the absence of even one of these three enzymes.
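How residual maintenance activity would shape passive demethylation can be sketched with a simple bookkeeping model of CpG dyads. The model below is a hypothetical illustration, not a description of any cited experiment: it tracks the fractions of dyads that are methylated on both strands, on one strand (hemi-methylated) or on neither, and introduces an assumed parameter, `maintenance`, for the probability that a newly hemi-methylated dyad is restored to the fully methylated state after replication. De novo methylation is assumed to be absent.

```python
def replicate(full: float, hemi: float, un: float, maintenance: float):
    """One round of replication for CpG dyad fractions (full, hemi, un).

    `maintenance` is a hypothetical probability that a hemi-methylated
    daughter dyad is restored to full methylation; no de novo methylation.
    """
    if not 0.0 <= maintenance <= 1.0:
        raise ValueError("maintenance must lie between 0 and 1")
    # A full dyad yields two hemi-methylated daughters; a hemi-methylated
    # dyad yields one hemi-methylated and one unmethylated daughter.
    new_hemi = full + hemi / 2.0
    new_un = un + hemi / 2.0
    return (maintenance * new_hemi, (1.0 - maintenance) * new_hemi, new_un)


def strand_methylation(full: float, hemi: float, un: float) -> float:
    """Per-strand methylation level, as a bisulfite-type assay would report."""
    return full + hemi / 2.0


for p in (0.0, 0.5, 1.0):
    state = (1.0, 0.0, 0.0)  # hypothetical locus, fully methylated at the start
    for _ in range(3):
        state = replicate(*state, maintenance=p)
    print(f"maintenance={p:.1f}: strand methylation after 3 rounds = "
          f"{strand_methylation(*state):.3f}")
```

In this model the per-strand methylation is multiplied by (1 + maintenance)/2 at each round, so even partial maintenance slows the loss considerably, and sequences that are preferred targets of the residual DNMTs would be expected to lose methylation later than others. The relative abundance of hemi-methylated dyads is what distinguishes the scenarios, which is why measurements of hemi-methylation would be informative.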
The conversion of 5mC into 5hmC and its subsequent passive demethylation may also be a potential DNA demethylation pathway in PGCs. The fact that TET1 binding (and hence the presence of 5hmC) is enriched in the promoters of LINE1 elements but is absent at repetitive elements, such as at IAP and minor satellite repeats in ESCs (Ficz et al., 2011; Williams et al., 2011; Wu et al., 2011), may account for the preferential demethylation at LINE1 elements but the relatively persistent presence of 5mC at IAP and minor satellite repeats in PGCs.
Active DNA demethylation in other contexts
Active DNA demethylation is reported to occur in a highly locus-specific fashion and to control gene expression in various contexts. There is evidence that GADD45A (growth arrest and DNA-damage-inducible 45a), a protein involved in the maintenance of genomic stability, DNA repair and suppression of cell growth, has a role in active DNA demethylation through the NER pathway in cultured fibroblasts (Barreto et al., 2007). It has also been reported that overexpression of AID and MBD4 in zebrafish embryos leads to active DNA demethylation through a combined pathway of 5mC deamination by AID followed by thymine base excision by MBD4, which is promoted by GADD45 (Rai et al., 2008). In somatic cell-ESC fusion experiments, AID has also been shown to facilitate epigenetic reprogramming towards pluripotency, which requires DNA demethylation (Bhutani et al., 2010). RNF4 (RING finger protein 4), a SUMO-dependent ubiquitin E3-ligase implicated in the maintenance of genome stability, has also been shown to have a role in active DNA demethylation both in mouse embryonic development and in cultured cells (Hu et al., 2010): in RNF4-deficient embryonic fibroblasts, DNA methylation at imprinted genes, such as at Peg1 and Peg3, is elevated from ∼50% to ∼75%, indicating that maintenance of the unmethylated state of the DMRs of the paternal alleles of these genes requires protection (demethylation) from erroneous methylation (Hu et al., 2010). RNF4 interacts with and requires TDG and APE1 for active demethylation, indicating the involvement of the BER pathway in this process (Hu et al., 2010).
In adult neurons, activity-induced GADD45B has been shown to demethylate DNA actively at promoters of key genes involved in adult neurogenesis and to induce their expression (Ma et al., 2009). Furthermore, another study shows that TET1 and APOBEC1 are involved in neuronal activity-induced region-specific active DNA demethylation and subsequent gene expression in the dentate gyrus of the adult mouse brain (Guo et al., 2011). This study shows that TET1 promotes DNA demethylation in human cultured cell lines, and this requires the BER pathway. In this system, 12 known human DNA glycosylases have been shown not to act directly on 5hmC. However, AID and APOBECs can efficiently deaminate 5hmC into 5hm uracil (5hmU) (AID cannot deaminate 5mC efficiently), which is then a preferable target for DNA glycosylases such as SMUG1 (single-strand selective monofunctional uracil DNA glycosylase) and TDG in their activation of the BER pathway. AID-mediated 5hmC deamination recapitulated the properties of the AID-mediated cytosine deamination observed in B cells, such as processivity, sequence selectivity, transcription dependence and strand preference. Thus, this study proposes a TET1-induced oxidation-deamination mechanism for active DNA demethylation (Guo et al., 2011).
Gene-knockout studies have revealed that the DNA glycosylase TDG has important functions in mouse embryogenesis (Cortazar et al., 2011; Cortellino et al., 2011). It has been shown to maintain the unmethylated state of CGIs at the promoters of developmentally regulated genes, such as Hoxa10, Hoxd13, Sfrp2 (secreted frizzled-related protein 2), Twist2 (twist homolog 2) and Rarb (retinoic acid receptor β) (Cortazar et al., 2011). In wild-type mouse embryonic fibroblasts, the CGIs at the promoters of these genes are free of 5mC and are associated with H3K4me2, but in TDG-deficient cells, they are aberrantly methylated and are associated with H3K27me3. On the promoters of wild-type cells, TDG forms a complex with BER pathway components, including XRCC1, APE1 and PARP1, and with the transcription-activating histone acetyltransferase CBP/p300 and the H3K4-specific methyltransferase MLL1. Interestingly, although TDG also associates with the promoters of such genes in ESCs, the epigenetic aberrations only manifest upon their differentiation, indicating that TDG contributes to the maintenance of active chromatin during cell differentiation, facilitating a proper assembly of the chromatin modifying complex and undergoing BER to counter aberrant de novo methylation (Cortazar et al., 2011). Another study supports the conclusion of the above-mentioned study and furthermore shows that, in TDG mutants, imprinted genes such as H19 and Igf2 show hypermethylation and that the developmentally regulated demethylation of the albumin gene enhancer fails to occur (Cortellino et al., 2011). Moreover, TDG forms a complex with AID and GADD45A, and shows a strong glycosylase activity towards 5hmU (Cortellino et al., 2011). Thus, the authors propose a two-step mechanism for DNA demethylation in mammals, in which 5mC or 5hmC is first deaminated by AID to thymine or 5hmU, respectively, which is then excised and repaired by the TDG-mediated BER pathway.
However, some of these pathways may not have a role in DNA demethylation in pre-implantation embryos and in PGCs. For example, Mbd4-deficient mice are fertile, and genome-wide DNA demethylation appears to occur normally in Mbd4-deficient zygotes (Millar et al., 2002; Santos and Dean, 2004). One study has shown that GADD45A has no DNA demethylation activity (Jin et al., 2008), and another that Gadd45a-deficient mice have neither locus-specific nor global defects in DNA methylation levels (Engel et al., 2009). In addition, the knocking out of Gadd45b does not affect the paternal DNA demethylation in zygotes (Okada et al., 2010). Furthermore, Apobec1 knockout mice are fully fertile (Hirano et al., 1996; Morrison et al., 1996). There remains a possibility that the negative results of these knockout experiments are due to functional redundancy with other proteins. As such, we should await the results of compound mutants, such as Gadd45a/Gadd45b-double knockout mice, Aid/Apobec1-double knockout mice or Aid/Tet1-double knockout mice. The conditional deletion of TDG in PGCs and in oocytes should also provide important new insights into the role of TDG in genome-wide DNA demethylation in pre-implantation embryos and in PGCs.
Conclusion
Despite considerable recent progress, much remains to be learned about the mechanisms and the consequences of the epigenetic reprogramming in pre-implantation embryos and in PGCs. As multiple and compound pathways for active DNA demethylation have been reported, future genetic analyses of candidate components of these DNA-demethylation pathways are required to substantiate their proposed mechanisms. This will require the creation of compound mutants, i.e. double or triple knockouts, as the candidate enzymes for DNA demethylation belong to families with similar activities. In PGCs, in addition to active DNA demethylation, replication-dependent passive DNA demethylation may also be involved, and this possibility should be examined experimentally by overexpressing key repressed genes in PGCs. At the same time, a more comprehensive determination of the mode of DNA demethylation, including the analysis of hemi-methylation states during critical developmental periods, is essential to obtain new insights into the mechanisms of DNA demethylation. Genome-wide quantification of underlying histone modifications would also provide key information about how epigenetic reprogramming proceeds. The development of new technologies to quantify genome-wide epigenetic modifications from small amounts of starting materials would also help to advance research in this field, as would new procedures to reconstitute epiblast and PGC development from pluripotent stem cells in vitro, in order to provide greater quantities of experimental material for such experiments (Hayashi et al., 2011). Concerted efforts along these lines will help to clarify the mechanisms of epigenetic reprogramming and may lead to the development of a strategy that will ultimately allow us to control the epigenetic state of a cell in vitro.
Acknowledgements
We thank the members of our laboratory for their input.
Funding
The authors are supported in part by a Grant-in-Aid from the Ministry of Education, Culture, Sports, Science and Technology of Japan; by JST-CREST/ERATO; by the Takeda Science Foundation; by the Uehara Memorial Foundation; and by the Mitsubishi Foundation.
References
Competing interests statement
The authors declare no competing financial interests.