Small non-coding RNAs make up much of the RNA content of a cell and have the potential to regulate gene expression on many different levels. Initial discoveries in the 1990s and early 21st century focused on determining mechanisms of post-transcriptional regulation mediated by small-interfering RNAs (siRNAs) and microRNAs (miRNAs). More recent research, however, has identified new classes of RNAs and new regulatory mechanisms, expanding the known regulatory potential of small non-coding RNAs to encompass chromatin regulation. In this Commentary, we provide an overview of these chromatin-related mechanisms and speculate on the extent to which they are conserved among eukaryotes.
Introduction
Although RNA was initially considered to be a passive intermediate in the flow of information from DNA to protein, the regulatory capacity of RNA is now well established. For several decades, it has been known that non-coding RNAs, such as the prokaryotic ~100-nucleotide RNAs MicC and MicF, can modulate translation efficiency by base pairing to an mRNA and altering its accessibility to ribosomes (Mizuno et al., 1984). However, the discovery of RNA interference (RNAi) by Craig Mello and Andrew Fire in 1998 showed that there is a class of much smaller RNAs (20-30 nucleotides) that have a high regulatory capacity (Fire et al., 1998). These small regulatory RNAs represent an ancient regulatory mechanism, as reflected by their presence in organisms ranging from yeast to plants and animals. Moreover, their silencing principles are widely conserved and typically involve a protein of the Argonaute family (Box 1). Over the past few years, it has been revealed that there is a world of small non-coding RNAs that not only regulate gene expression at post-transcriptional and transcriptional levels, but also affect the organization and modification of chromatin. In this Commentary, we concentrate on this latter role of small non-coding RNAs, highlighting their involvement in transposon control, centromere function, silencing of the unpaired sex chromosome during meiosis, paramutation and DNA elimination. Before delving into these specific areas, we first introduce concepts concerning RNAi and chromatin that are relevant to these topics.
Small-interfering RNAs (siRNAs) of ~21 nucleotides are the central players in many RNAi mechanisms (Fig. 1). They can be produced from long double-stranded RNA (dsRNA) through cleavage by the endonuclease Dicer and are bound by an Argonaute protein. siRNAs identify their target by perfect sequence complementarity and establish silencing of the recognized mRNAs (Fig. 1A). The source of the initial dsRNA can be either exogenous (e.g. from viral replication) or endogenous (Fig. 1B). Expression of endogenous siRNAs is often tissue- or developmental-stage-specific, with a bias towards expression in reproductive tissues or stages.
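The notion that an siRNA identifies its target by perfect sequence complementarity can be illustrated with a minimal Python sketch. This is a toy model only: in the cell, recognition is carried out by an Argonaute-siRNA complex, not by string matching, and the function names (`reverse_complement`, `find_perfect_targets`) and the 21-nucleotide guide sequence are hypothetical and chosen purely for illustration.

```python
from typing import List

# Toy illustration only: real target recognition is performed by an
# Argonaute-siRNA complex. Names and sequences here are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_perfect_targets(guide: str, mrna: str) -> List[int]:
    """Return 0-based start positions in `mrna` that are perfectly
    complementary to the full length of the `guide` strand."""
    site = reverse_complement(guide)
    mrna = mrna.upper()
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)  # allow overlapping sites
    return positions


if __name__ == "__main__":
    guide = "AUGCUAGCUAGGAUCCGAUAC"  # hypothetical 21-nt guide strand
    mrna = "GGGA" + reverse_complement(guide) + "CCCU"
    print(find_perfect_targets(guide, mrna))
```

In this simplified picture, a single mismatch anywhere in the site abolishes the match, which is stricter than the biology but captures the contrast with partially complementary recognition.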
The genomes of many organisms encode machinery to amplify siRNAs after target recognition (Baulcombe, 2007): the target recognized by the primary siRNA recruits an RNA-directed RNA polymerase (RdRP), which synthesizes new RNA that is antisense to the target (Fig. 1). In some organisms, this results in the formation of new dsRNA, which is processed by Dicer into secondary siRNAs; in other organisms, Dicer is not involved in the amplification step and the RdRP is thought to synthesize single-stranded secondary siRNAs directly. Although only a subset of eukaryotes (including Schizosaccharomyces pombe, Neurospora, Caenorhabditis elegans and plants) encode an obvious RdRP enzyme in their genomes, several other animal species were recently found to possess RdRP activity mediated by divergent enzyme complexes (Lipardi and Paterson, 2009; Maida et al., 2009). This suggests that amplification of siRNAs is potentially more widely conserved than the conservation of the RdRP enzyme suggests.
Many animals (including Drosophila, zebrafish and mammals) carry a germline-enriched class of small RNAs, called Piwi-interacting RNAs (piRNAs) (for a review, see Klattenhoff and Theurkauf, 2008). piRNAs are longer (24-30 nucleotides) than siRNAs and have some conserved sequence features due to their biogenesis pathway. They bind to a germline-specific subset of Argonaute proteins, referred to as Piwi proteins (Box 1). Their biogenesis is independent of Dicer, but depends on an amplification cycle catalyzed by the Piwi proteins themselves.
Two major types of chromatin-silencing modifications have been correlated with small-RNA-mediated genome surveillance: histone modification and DNA methylation. Histone proteins consist of a globular core and extending tails at their N-terminal and C-terminal ends. Several different amino acid modifications have been described in histone proteins, mostly within the N-terminal tails, including acetylation, methylation, ubiquitylation, SUMOylation, ADP ribosylation and phosphorylation [for a detailed review, see Kouzarides (Kouzarides, 2007)]. Modifications of histone tails can alter the interactions between the histones and other nucleosomes, or can create binding sites for a variety of chromatin-modifying proteins that translate (combinations of) modifications into chromatin-condensing or -relaxing activities. An acetylated lysine residue, for example, is typically recognized by proteins that have a bromodomain and is associated with active transcription. Conversely, lysine methylation can be associated with either activation or repression of transcription, depending on the exact site of the mark. Methylation of lysine 9 of histone H3 (H3K9me) can be recognized by chromodomain proteins, such as heterochromatin protein 1 (HP1), which recruits deacetylating activity to the locus and induces packaging into inactive heterochromatin. Methylated lysine 4 of the same histone (H3K4me) can be bound by double-chromodomain proteins, such as CHD1, or by plant homeodomain (PHD) proteins, and recruits enzymatic activities that promote active, open chromatin. It is, however, important to note that the effect of a histone modification is highly dependent on its total context; it is too simplistic to regard individual modifications as being activating or repressing. In the context of small-RNA-induced chromatin modification, H3K9 methylation is the best-described modification. This modification is recognized by the RNA-induced transcriptional silencing (RITS) complex, as will be discussed below.
Box 1. Argonautes
Argonaute proteins are small-RNA-binding proteins that are characterized by the presence of three domains: the PAZ domain, the Mid domain and the PIWI domain (see Figure, part A). The PAZ domain forms a binding pocket for the 3′ end of the small RNA, whereas the 5′-terminal nucleotide is bound by the Mid domain [for a review and the crystal structure, see Ender and Meister (Ender and Meister, 2010)]. The PIWI domain has catalytic endonuclease activity, referred to as slicing. The slicer activity allows the Argonaute to degrade one of the strands of the dsRNA, thereby exposing the single-stranded RNA and allowing it to detect its target by sequence complementarity (Leuschner et al., 2006; Matranga et al., 2005; Rand et al., 2005; Steiner et al., 2009). Slicer activity is also involved in the degradation of target mRNAs, although it is often not required for downstream silencing effects; several Argonautes have been identified that do not possess this endonuclease activity but still function in silencing pathways (Yigit et al., 2006).
The genomes of most organisms encode multiple Argonaute proteins. An Argonaute typically binds to one specific type of small RNA (an miRNA, siRNA or piRNA) and, by interacting with other proteins, plays a decisive role in determining the downstream effects of the small RNA. Argonautes are roughly divided into two subclasses based on their sequence alignment: the Ago proteins (yellow area in Figure, part B), which resemble Arabidopsis AGO1, and the Piwi proteins (green area in Figure, part B), which resemble Drosophila Piwi. Ago proteins are present in both plants and animals, but Piwi proteins are only found in animals and are generally germline-specific. In worms, a third type of Argonaute is present, most of which bind to secondary siRNAs (Yigit et al., 2006). At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Hs, Homo sapiens; Nc, Neurospora crassa.
Although not common to all eukaryotes, methylation of DNA itself constitutes a more stable type of modification than histone modification. Cytosines, particularly those in a CpG context, can be methylated in the DNA of plants and mammals. CpG methylation is symmetrical and found in both sister chromatids following genome duplication. Non-CpG methylation is often asymmetrical and must be re-established after every cell division to be maintained, and is therefore less stable than symmetrical methylation. Non-CpG methylation has mainly been described in plants. Both types of DNA methylation generally induce the formation of compact, inactive chromatin, although in plants some methylation of open reading frames is associated with active transcription (for a review, see Munshi et al., 2009; Quina et al., 2006).
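The distinction between symmetrical and asymmetrical methylation contexts follows directly from the sequence, as the following minimal Python sketch illustrates. It classifies each cytosine on one strand as lying in a CpG context, a CpNpG context (a second symmetrical context, relevant to the plant methyltransferases discussed later) or any other context, and reports where the partner cytosine on the opposite strand would lie for the symmetrical cases. The function names and the example sequence are hypothetical, and the sketch ignores everything about how methylation is actually deposited or maintained.

```python
from typing import List, Optional, Tuple


def cytosine_contexts(seq: str) -> List[Tuple[int, str]]:
    """Classify every cytosine on the given strand (5'->3') by context.

    Returns (position, context) pairs, where context is 'CG', 'CNG'
    or 'other'. Cytosines too close to the 3' end to be classified
    are skipped.
    """
    seq = seq.upper()
    calls = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        if i + 1 < len(seq) and seq[i + 1] == "G":
            calls.append((i, "CG"))
        elif i + 2 < len(seq) and seq[i + 2] == "G":
            calls.append((i, "CNG"))
        elif i + 2 < len(seq):
            calls.append((i, "other"))
    return calls


def partner_position(position: int, context: str) -> Optional[int]:
    """Return the position of the G that pairs with the mirrored cytosine
    on the opposite strand, or None if the context is not symmetrical."""
    offsets = {"CG": 1, "CNG": 2}
    if context not in offsets:
        return None
    return position + offsets[context]


if __name__ == "__main__":
    for pos, ctx in cytosine_contexts("ACGTCAGTCTTA"):
        print(pos, ctx, partner_position(pos, ctx))
```

Because CG and CNG read the same on both strands, a methylated cytosine in these contexts has a partner cytosine on the opposite strand that can carry the same mark; cytosines in other contexts have no such partner, which is why their methylation must be re-established after replication.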
Control of transposable elements
Transposable elements were first discovered in the 1940s (McClintock, 1950) and are now known to make up a large portion of the genomes of most organisms. For example, 10% of the Arabidopsis genome consists of transposons and transposon remains, and transposons account for 45% of the sequence of the human genome. The ability of transposons to integrate at novel sites in the genome makes them intrinsically mutagenic and therefore an important silencing target. Early on, the phenomenon of transposon silencing was found to have substantial genetic overlap with RNAi in C. elegans (Ketting et al., 1999; Sijen and Plasterk, 2003; Tabara et al., 1999; Tijsterman et al., 2002; Tops et al., 2005; Vastenhouw et al., 2003). However, it is still unknown whether transposon silencing in worms happens at a post-transcriptional or transcriptional level.
siRNAs and transposon silencing in Arabidopsis thaliana
Plants use special 24-nucleotide siRNAs to control transposons in their genomes (Chan et al., 2004; Hamilton et al., 2002; Llave et al., 2002; Qi et al., 2006). The origin of these siRNAs is still largely unknown, but is probably some type of dsRNA (Fig. 1B). The induction of transposon silencing in plants is a two-step process; each step involves a plant-specific RNA polymerase (Daxinger et al., 2009) (Fig. 2A). In the first step, RNA polymerase IV (Pol IV) assists in the creation of an siRNA population by initiating transcription from specific loci. Pol IV is required for transposon control, as loss of Pol IV leads to transcriptional reactivation of several transposons (Huettel et al., 2006). How Pol IV identifies the target loci, however, is currently unclear. A second enzyme involved in initiating siRNA biogenesis is the RdRP RDR2. RDR2 might be involved in synthesizing dsRNA on the Pol IV transcripts or it might synthesize dsRNA from RNAs that have been recognized by pre-existing siRNAs. Finally, the RNAs are diced into 24-nucleotide siRNAs by the specific Dicer DCL3 and are bound by Argonaute protein AGO4.
The second step in plant transposon silencing involves the recruitment of chromatin modifications to the transposon sequence through binding by the siRNA-loaded AGO4. This step requires the plant-specific RNA polymerase Pol V (also known as Pol IVb) to create a platform for siRNA binding (Wierzbicki et al., 2008). It is possible that Pol V synthesizes RNA transcripts from the target loci, which can in turn be recognized by the siRNAs; an alternative possibility is that Pol V just opens up the chromatin to allow access to the transposon site.
Plant transposon loci are marked by histone H3K9 methylation (Ding et al., 2007; Gendrel et al., 2002; Mette et al., 2000; Sijen et al., 2001; Wassenegger, 2000) and by DNA methylation (Cokus et al., 2008; Lister et al., 2008; Miura et al., 2001). Histone methylation requires the histone methyltransferase KYP and also involves the chromatin-remodeling factors DRD1 and DDM1 (Chan et al., 2006; Hirochika et al., 2000; Jeddeloh et al., 1998; Kanno et al., 2004). All transposable elements in the genome seem to be targeted equally by H3K9 methylation (Tran et al., 2005). DNA methylation by the symmetrical CpNpG methyltransferase CMT3 marks the same loci as KYP, whereas the CpG methyltransferase MET1 and the de novo methyltransferase DRM2 target only a subset of transposable elements (Tran et al., 2005). Although each of the individual methylation marks contributes to the silencing of transposon loci, the extent to which each contributes is different (Cao and Jacobsen, 2002; Kato et al., 2003; Zhang et al., 2006).
The three silencing mechanisms discussed above (siRNAs, histone methylation and DNA methylation) are intricately related (Johnson et al., 2002; Lippman et al., 2003). The DNA CpG methyltransferase MET1 (but not CMT3) directs the distribution of H3K9 methylation to target loci (Johnson et al., 2002; Soppe et al., 2002; Tariq et al., 2003). In turn, H3K9 methylation can recruit the CpNpG methylase CMT3, thereby reinforcing DNA methylation (Jackson et al., 2002; Lindroth et al., 2004). Argonaute AGO4 is involved in both histone methylation and non-CpG DNA methylation (Tran et al., 2005; Zilberman et al., 2003), which places the small RNAs downstream of MET1 and upstream of KYP and CMT3. The molecular mechanism of this interdependence is not entirely clear, but it is conceivable that the chromatin-remodeling factors are recruited to the DNA by AGO4 (the positioning of which is probably directed by Pol V activity), where they facilitate access to the locus for de novo DNA methyltransferases and histone methyltransferases.
In support of this view, AGO4 was shown to interact with Pol V (Li et al., 2006; Pontier et al., 2005) and with Pol-V-generated RNA at the target loci (He et al., 2009), and to direct DNA methylation to the transposon site. However, although all transposons are targeted by siRNAs, loss of AGO4 leads to de-silencing of only some transposons (Lu et al., 2006). In addition, Mosher et al. found that Pol V can silence a subset of transposons by DNA methylation without a requirement for siRNAs (Mosher et al., 2008). Together, these results support a model whereby siRNAs direct silencing of some unstably silenced transposable elements, whereas constitutively silenced transposons are regulated by an RNA-independent mechanism in which siRNAs might be synthesized as a back-up system.
piRNAs and transposon silencing in animals
In Drosophila and vertebrates, siRNAs play a role in transposon silencing in somatic tissues (Chung et al., 2008; Czech et al., 2008; Ghildiyal et al., 2008; Kawamura et al., 2008), but in germ cells transposons are mainly controlled by piRNAs, which bind to members of the Piwi clade of Argonaute proteins. A large fraction of Drosophila and zebrafish piRNAs map to transposon sequences (Brennecke et al., 2007; Houwing et al., 2007) and transposons are de-silenced in Piwi mutants (Aravin et al., 2001; Houwing et al., 2008; Vagin et al., 2004). These data support the idea that there is a close connection between piRNAs and transposon regulation.
How are the transposon-silencing effects of piRNAs established? In Drosophila, DNA methylation is less abundant than in plants and the main plant CpG methyltransferase, MET1, has no homolog in Drosophila. It was recently shown that the methyltransferase Dnmt2 can methylate retrotransposons in somatic cells (Phalke et al., 2009), but this enzyme appears to be inactive in the germ line and is therefore unlikely to be directed by piRNAs. Instead, several indications suggest that piRNA-mediated regulation of transposons in Drosophila correlates with histone modification. Mutations in Piwi-type Argonautes and related proteins induce mislocalization of H3K9 methylation and of the chromodomain proteins HP1 and HP2 (Klenov et al., 2007; Pal-Bhadra et al., 2004). Moreover, Piwi interacts directly with HP1a, the Drosophila HP1 variant that is associated with transgene silencing (Brower-Toland et al., 2007). Furthermore, the RNA helicase Spindle E, which is required for transposon silencing, is also required for H3K9 methylation of several transposon classes (Klenov et al., 2007). Although the proteins constituting this link between piRNAs and histone methylation have not yet been identified, the Drosophila homolog of KYP, the H3K9 methyltransferase Su(var)3-9, is a likely candidate.
In mice, two different subsets of piRNAs can be distinguished by the developmental stage in which they are expressed: pachytene piRNAs, which are bound by the Piwi proteins Mili and Miwi, and pre-pachytene piRNAs, which bind to Mili and Miwi2. Pachytene piRNAs mostly map to non-transposable sequences and appear to play a role in the regulation of meiosis. No links to chromatin have been demonstrated for these piRNAs so far. By contrast, a substantial fraction of pre-pachytene piRNAs map to transposon loci (Aravin et al., 2006; Carmell et al., 2007; Girard et al., 2006; Watanabe et al., 2006) and, moreover, some transposon de-silencing is detected in the Piwi mutants mili and miwi2 (Aravin et al., 2007). In addition, transposons in mice are marked by DNA methylation and the re-establishment of this mark following demethylation in germ cells requires Piwi proteins Mili and Miwi2 (Aravin et al., 2008; Kuramochi-Miyagawa et al., 2008). Together, this suggests a role for piRNAs in the modification and silencing of transposon sequences.
There are three known active DNA methyltransferases (Dnmt1, Dnmt3a and Dnmt3b) in mice, and DNA methylation of repetitive sequences by Dnmts is essential for development (Kato et al., 2007; Okano et al., 1999; Walsh et al., 1998). Interestingly, the catalytically inactive family member Dnmt3L is also required for correct targeting of transposon sequences, mainly during male meiosis (Bourc'his and Bestor, 2004; Hata et al., 2006; Webster et al., 2005). DNA methylation seems to be guided to transposons by H3K9 methylation (established by the H3K9 methyltransferase Suv39) and by the DDM1 homolog Lsh1 (Huang et al., 2004; Lehnertz et al., 2003; Yan et al., 2003), although mutation of Suv39 results in only moderate upregulation of transposon activity (Martens et al., 2005). Unfortunately, the molecular links that connect piRNAs and Piwi proteins to DNA methylation or histone methylation are currently unknown.
Although the small-RNA class that mediates transposon silencing differs between plants and mammals, the silencing effect is remarkably similar in both systems (Fig. 2). Small RNAs probably direct histone methylation and/or DNA methylation to achieve control of transposon loci. In addition, a subset of loci are very stably silenced independently of small RNAs, whereas other loci – probably the unstably silenced ones – require small RNAs to maintain silencing or to redefine the silencing through de novo methylation.
Satellites and centromeres
Repetitive sequences are required for chromosome dynamics, but are also dangerous in terms of genome stability. A clear example of a repetitive region that is of vital importance to the cell is the centromere. Both the centromeric core and the pericentromeric region contain numerous repeats of short sequences (known as satellites) in most organisms (Box 2). In addition, the centromeres of many higher eukaryotes contain high levels of transposons and transposon remains. Work primarily conducted in the fission yeast S. pombe has shown that the repetitive pericentromeric region is a prominent target for small-RNA-mediated chromatin modification.
Pericentromeric heterochromatin in S. pombe
In fission yeast, pericentromeric regions are targeted by siRNAs, resulting in the formation of heterochromatin (Reinhart and Bartel, 2002; Volpe et al., 2002) (for reviews, see Martienssen et al., 2005; Verdel et al., 2009). Three interacting protein complexes are involved and together form a self-enhancing amplification loop (Fig. 3A). The first complex involved is the RITS complex, which consists of the Argonaute protein Ago1, the chromodomain protein Chp1 and the self-associating adaptor protein Tas3 (Li et al., 2009). This complex is thought to determine the genomic location of the heterochromatin through association of Ago1 with a locus. The second complex, the RNA-directed RNA polymerase complex (RDRC), contains the RNA-directed RNA polymerase Rdp1, the helicase Hrr1 and the nucleotidyltransferase Cid12 (Motamedi et al., 2004; Verdel et al., 2004). It mediates the amplification of small RNAs from the targeted locus. The third complex is the Clr4-containing complex (CLRC), which establishes the heterochromatic mark.
Box 2. Centromere organization
Chromosomes need functional centromeres for reliable segregation during cell division. The functional domains of the centromere are well conserved, although the sequences involved are highly divergent among organisms. The core domain (shown in green in upper figure), marked by the centromere-specific H3 variant CENP-A, constitutes the platform on which the inner and outer kinetochores (light and dark blue, respectively) are built, and connects the chromatids to the microtubules of the spindle (black). The pericentromeric regions (brown) assist in the loading of CENP-A onto the central core and bind cohesins (which keep sister chromatids together during spindle formation). In addition, they form a rigid heterochromatic structure that helps the chromosome to orient its chromatids in a way that favors bipolar spindle attachment.
The S. pombe centromeric core consists of a short conserved core sequence flanked by the inverted inner repeat (imr). The pericentromeric region consists of outer repeats, which vary in length between chromosomes and contain alternating dg and dh elements. No transposon sequences are located in the S. pombe centromeric region, although dg and dh repeats might be derived from ancient transposon remains (Baum et al., 1994; Clarke et al., 1986; Hahnenberger et al., 1991; Kniola et al., 2001; Nakaseko et al., 1986; Partridge et al., 2000). The A. thaliana centromeric core region consists primarily of thousands of repeats of a 180 bp element (cen180) interspersed with multiple copies and remains of the Athila retrotransposon (106B; orange). The pericentromeric region also contains many copies of Athila and other retroelements and DNA transposons, as well as stretches of short transposon remains (Copenhaver et al., 1998; Copenhaver et al., 1999; Heslop-Harrison et al., 1999; Kumekawa et al., 2000; Richards et al., 1991). The human centromere consists mainly of α-satellite repeats. In the central region, these repeats are organized into higher-order repetitive arrays; however, not all of these arrays are occupied by CENP-A, so some of them are actually part of the pericentromeric region. Outside this area, there are monomeric α-satellite repeats, which are interrupted by blocks of SINE and LINE elements (orange) and some stretches of unique sequence (yellow) (Horvath et al., 2000; Jackson, 2003; Manuelidis and Wu, 1978; Schueler and Sullivan, 2006; Spence et al., 2002; Tyler-Smith et al., 1993; Warburton et al., 1997; Wevrick and Willard, 1989; Willard et al., 1983; Yang et al., 1982). C. elegans (not shown) is holocentric and therefore many sites along the chromosome act as a centromere and recruit centromeric proteins necessary for segregation. No specific centromeric sequences or nucleating regions have yet been described in worms (Buchwitz et al., 1999; Moore et al., 1999; Oegema et al., 2001).
The effect of targeting pericentromeric repeats by RNAi is twofold. First, transcripts derived from the repeats are degraded by Dicer and therefore unavailable for translation. This phenomenon is referred to as co-transcriptional gene silencing (CTGS) or cis-post-transcriptional gene silencing (cis-PTGS). Second, the RITS complex recruits histone deacetylase activity and H3K9 methylation activity, which results in transcriptional gene silencing (TGS) [for recent reviews, see Cam et al. and Moazed (Cam et al., 2009; Moazed, 2009)].
Although paradoxical, transcription from the heterochromatic locus is key to its silencing. Pericentromeric transcripts can be detected biochemically and are recognized by RITS complexes that are preloaded with a pericentromeric siRNA. RITS then recruits RDRC, which brings an RdRP into the vicinity of the transcript, leading to the synthesis of dsRNA on the transcript strand. This dsRNA is processed by Dicer into new siRNAs that can load onto additional RITS complexes, thereby reinforcing the silencing (Colmenares et al., 2007). A second reinforcement loop is established by the histone methylase Clr4, which associates with RITS (Sugiyama et al., 2005; Zhang et al., 2008) via the LIM-domain protein Stc1 (Bayne et al., 2010). Clr4 activity results in H3K9 methylation, which attracts the HP1 homolog Swi6, leading to condensation of the DNA and stabilizing the association of RITS with the locus through binding of Chp1 to modified histone H3. Remarkably, Clr4 is required for the interaction between RITS and RDRC, suggesting that RITS and RDRC might only be able to interact in a heterochromatic context.
Both RITS and RDRC are continuously required to maintain silencing of the pericentromeric region (Volpe et al., 2002) and for pericentromeric structure; mutation of RITS or RDRC components was shown to lead to increased errors in chromosome segregation due to premature loss of centromeric cohesion (Volpe et al., 2003; Win et al., 2006). Nevertheless, two sets of experiments have shown that, in the end, most RNAi factors are dispensable for centromere function. First, the tethering of Tas3 to a locus was shown to cause recruitment of all components required to start an amplification loop and a silencing response (Buhler et al., 2006). This reaction still depends on active RNAi machinery, but bypasses the need for a triggering siRNA. Second, direct tethering of Clr4 to a locus was found to render the entire RNAi machinery dispensable (Kagansky et al., 2009). Therefore, the RNAi machinery appears to be an elegant way to define heterochromatic pericentromeric domains without the need for sequence conservation (Folco et al., 2008; Ishii et al., 2008). However, in itself it is dispensable for accurate loading of centromere protein A (CENP-A) (see Box 2) and centromere function.
Organization of holocentromeres in C. elegans
Although C. elegans chromosomes are holocentric and therefore the centromere function is spread along the entire chromosome, an RNAi-related mechanism similar to the one described for S. pombe seems to be in place (Fig. 3B) (Claycomb et al., 2009; Gu et al., 2009; van Wolfswinkel et al., 2009). More specifically, an RdRP (EGO-1) interacts with a nucleotidyltransferase (CDE-1) and a helicase (DRH-3), possibly indicating the presence of an RDRC-like complex. In terms of a RITS-like complex, only the core protein, the Argonaute CSR-1, is currently known to interact with the RDRC-like complex in C. elegans, but it is likely that a chromatin-binding protein is also involved in RNAi-directed chromatin modification. In fact, several such proteins were identified in a screen that also identified CDE-1 and CSR-1 (Robert et al., 2005), although none of these proteins has yet been found to interact with CSR-1 or with any of the other centromeric RNAi proteins.
Similar to the situation in S. pombe, loss of any of the above-mentioned proteins results in severe chromosomal mis-segregation in C. elegans. Remarkably, however, the regions targeted by the C. elegans RITS- and RDRC-like complexes seem to be coding regions and not repetitive sequences, as in S. pombe. Consequently, instead of overlapping with H3K9 methylation, CSR-1 localization corresponds to regions that are H3K4-methylation enriched. As C. elegans chromosomes are holocentric and no conserved centromeric sequence is known, it is currently unknown whether the coding regions function as the pericentromeric sequence in C. elegans or whether another centromeric organization is in place. It is tempting to speculate that C. elegans uses the presence of active genes in DNA as a mark that the DNA must be reliably segregated during cell division.
Satellite centromeres in plants and mammals
In many organisms, centromeric domains are less well defined than in S. pombe. Both human and Arabidopsis centromeres, for example, consist of long stretches of satellite repeats, some of which belong to the centromere core domain and some to the pericentromeric region (see Box 2).
Arabidopsis pericentromeric repeats are transcribed and siRNAs derived from these sequences are abundant. Moreover, increased levels of siRNAs are associated with hypermethylation of centromeric DNA (Chen, M. et al., 2008). Two types of repeats (cen180 and the Athila-transposon-derived 106B) are most prominent in the Arabidopsis centromeric region and, interestingly, they appear to be silenced by two different mechanisms.
Cen180 repeats are H3K9 methylated, which requires the histone methyltransferase KYP. The biogenesis of siRNAs derived from cen180 repeats depends on RDR2, DCL3, DDM1 and probably an Argonaute that has not yet been identified. Mutation of RDR2, DCL3 or DDM1 results in the absence of cen180-repeat-derived siRNAs, but has no effect on H3K9 methylation of the corresponding loci, suggesting that an RNAi-independent mechanism maintains silencing of cen180 repeats.
The 106B repeats are also H3K9 methylated and are targeted by siRNAs. The biogenesis of siRNAs derived from these repeats is dependent on RDR2 and DCL3, but not on DDM1. In this case, the Argonaute involved is known to be AGO1 (May et al., 2005). This difference in requirements for siRNA production between cen180 and 106B repeats is probably due to the fact that production of the 106B-repeat-derived siRNAs appears to be initiated from adjacent long terminal repeat (LTR) transposons. Interestingly, loss of AGO1 results in reduced levels of H3K9 methylation at the pericentromeric region, suggesting that 106B-repeat-derived siRNAs are causal to the histone modifications.
Together, the current data suggest that cen180 repeats are constitutively silenced by an RNAi-independent mechanism, whereas 106B repeats are silenced de novo, which requires the small-RNA trigger. How this relates to the presence of siRNAs from the cen180 repeats is not entirely clear, but it is possible that these siRNAs are a consequence of the silencing, rather than a cause. Moreover, no defects in centromere function have yet been attributed to small-RNA-related events; therefore, the functional relevance of centromeric siRNAs in Arabidopsis is not entirely clear.
Human centromeres can span several megabases of sequence and consist primarily of α-satellite repeats. Much of this repeat sequence can be deleted without interrupting centromere function, but formation of de novo centromeres is most efficient on these specific satellite stretches (Harrington et al., 1997; Ikeno et al., 1998). Human centromeric regions are also associated with H3K9 methylation (Rosenfeld et al., 2009) and RNA has been shown to play a role in kinetochore integrity during mitosis (Chueh et al., 2009; Wong et al., 2007). It has been suggested that artificially introduced siRNAs can induce H3K9 methylation (Kim et al., 2006; Weinberg et al., 2006) and that this might be followed by DNA methylation (Hawkins et al., 2009; Morris et al., 2004), although additional studies are required to confirm these observations. Moreover, the formation of neocentromeres on a human chromosome was shown to require the presence of Dicer (Fukagawa et al., 2004). Together, these observations suggest that small RNAs are in some way involved in mammalian centromeric chromatin, but none of the molecular steps involved has yet been identified. It is possible that small RNAs are mainly involved in the initial establishment of centromere function, as reflected by neocentromere formation. In this context, it would be interesting to evaluate the role of small RNAs in the segregation of the equine chromosome 11. This chromosome has an evolutionarily young centromere that is devoid of centromeric satellites (Wade et al., 2009), raising the possibility that it relies more heavily on small RNAs for its structure and function.
Meiotic silencing of unpaired chromatin
Unpaired chromatin can induce checkpoint activation during meiosis. In species in which sex is determined by sex chromosomes, one of the sexes generally contains a chromosome that is unable to pair with a perfect homolog and therefore poses a threat to meiotic progression. In mammals, these chromosomes are silenced and form a highly condensed sex body.
Many lines of evidence link meiotic silencing of unpaired chromatin (MSUC) to small-RNA-mediated silencing. In the filamentous fungus Neurospora, any region of a chromosome that is unable to pair, or shows less than 95% identity to its pairing partner, is silenced by H3K9 methylation during meiosis (Pratt et al., 2004; Shiu et al., 2001). This silencing mechanism requires the RdRP Sad-1, the Argonaute Sms-2 and the Dicer-like protein Sms-3 (Alexander et al., 2008; Lee et al., 2003), strongly indicating that this mechanism is related to RNAi, although the involvement of small RNAs in this process has not yet been shown.
Most other organisms apply less stringent inspection of their genome during meiosis, although unpaired sex chromosomes are usually distinguished from the pairing autosomes. In mammals, the sex chromosomes in males are H3K9 methylated and separated from the autosomes in a region referred to as the sex body (Khalil et al., 2004; Monesi, 1965). The HMG-box protein Maelstrom was found to localize to the sex body, as well as to other unsynapsed chromosomal regions (Costa et al., 2006), and is also linked to the piRNA pathway in both mouse and Drosophila (Findley et al., 2003; Soper et al., 2008). This suggests that MSUC might be connected to small RNAs in animals as well. However, in addition to the H3K9me2 mark, chromatin of the sex body retains several marks associated with double-strand breaks (DSBs) (Handel, 2004; Mahadevaiah et al., 2001). Several proteins that are involved in crossover formation and DSB repair (including BRCA1 and ATR) are also involved in MSUC in mouse (Baarends et al., 2005; Turner et al., 2004; Turner et al., 2005). Moreover, part of the induction of MSUC seems to be triggered by the presence, or possibly the longer persistence, of unrepaired DNA breaks on unsynapsed chromosomes (Mahadevaiah et al., 2008; Schoenmakers et al., 2008). As the piRNA pathway is involved in transposon silencing and therefore affects the occurrence of DSBs, further analysis will be required to dissect the contribution of small RNAs to MSUC in mouse. In this light, it is worth noting the recent identification in Neurospora of a novel class of small RNAs, named qiRNAs (Lee et al., 2009). These small RNAs were found to be associated with DNA damage and were suggested to play a role in downregulating protein synthesis upon induction of DNA damage, but they might, in addition, play a role in meiotic silencing.
In C. elegans, the unpaired male X chromosome is also marked by H3K9 methylation and is condensed into a compact body during the pachytene stage of meiosis (Bean et al., 2004), similar to the mammalian sex body. The presence of the H3K9 mark is related to synapsis; in a him-8 mutant, which has an X-chromosome-specific pairing defect, both hermaphrodite X chromosomes are marked (Bean et al., 2004). Interestingly, the zinc-finger protein HIM-17, which functions in concert with the meiotic recombination protein SPO-11 to create DSBs during crossover formation, is required for MSUC in C. elegans (Reddy and Villeneuve, 2004), again supporting a connection between MSUC and meiotic DSB formation. The presence of SPO-11 itself, however, is not required, indicating that it is not the meiotic DSBs themselves that are essential for MSUC.
Genetic data suggest that C. elegans MSUC is also closely related to mechanisms involving small RNAs. Mutation of the RdRP EGO-1 results in the absence of the silencing mark (Maine et al., 2005), suggesting a parallel between the mechanisms of MSUC in C. elegans and in Neurospora. The other RDRC- and RITS-like proteins introduced above are also involved. In contrast to the loss of EGO-1, however, loss of any of these other proteins (CSR-1, DRH-3, EKL-1 and CDE-1) leads to ectopic H3K9 methylation, whereas synapsis is not affected [(She et al., 2009); J.C.W. and R.F.K., unpublished data]. This suggests that the RDRC- and RITS-like proteins are required to keep the EGO-1 amplification localized and focused. However, double mutants in which a mutation in any of these genes is combined with ego-1 retain the ectopic H3K9 methylation, showing that the methylation mark can be formed in the absence of EGO-1 (She et al., 2009). It therefore seems that two different pathways – a DSB-related pathway and an EGO-1-centered pathway – interact to determine the localization of H3K9 methylation. An intriguing possibility is that, on the one hand, DSB remnants trigger H3K9 methylation, possibly mediated by small RNAs. On the other hand, H3K9 methylation can be excluded from the autosomes by the siRNAs made by EGO-1, through the CSR-1 pathway (Fig. 4). Further analysis of MSUC in C. elegans will be required to test this hypothesis and reconcile all observations, possibly allowing us to obtain new insights that can be extended to the mechanism of MSUC in mammals.
Paramutation
An enigmatic phenomenon known as paramutation has been the subject of investigation in plants for many decades. In paramutation, an expressed naive allele (the paramutable allele) can be modified into a silenced paramutated allele through interaction in a heterozygous situation with a paramutagenic allele. The paramutated allele is typically stable and heritable, and is itself paramutagenic when placed in a heterozygous situation with a paramutable allele. Importantly, no sequence changes are involved in the shift from paramutable to paramutated allele – the effect is purely epigenetic. Several examples of paramutable loci have been described in maize, including the b1 locus, which regulates the synthesis of the anthocyanin pigment. The paramutable B-I allele is transcriptionally active and results in dark pigmentation. When crossed to the light-pigmented paramutagenic B′ allele, all progeny have light pigmentation, and crossing of the first filial (F1) generation to new naive B-I again leads to a completely light-pigmented F2 generation. Interestingly, the sequence that causes this effect is a repetitive region located ~100 kb upstream of the b1 locus. The strength of the effect correlates with the number of repeats (Stam et al., 2002a; Stam et al., 2002b). In contrast to the typical situation in other systems, the repetitive region associated with the active B-I allele is hypermethylated but open, whereas in the silenced B′ allele, DNA methylation is reduced and the chromatin is compacted. Repetitive or transposon-related sequences also play an important role at other paramutable loci, although the exact organization varies considerably.
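The inheritance pattern described for the b1 locus can be illustrated with a small toy model. The sketch below is only an illustration under simplifying assumptions that are not taken from the studies cited: conversion of B-I to B′ is treated as fully penetrant and irreversible, and the hypothetical Mendelian control assumes that the active B-I allele would be dominant if no conversion took place. The function names are invented for this example.

```python
# Toy model of paramutation at the maize b1 locus.
# Assumptions (illustrative only): conversion is fully penetrant and
# heritable; in the Mendelian control, the active B-I allele is dominant.

B_I = "B-I"      # paramutable allele: active, dark pigmentation
B_PRIME = "B'"   # paramutagenic allele: silenced, light pigmentation


def paramutate(genotype):
    """In a heterozygote, every B-I allele is converted to B'."""
    if B_PRIME in genotype:
        return (B_PRIME, B_PRIME)
    return genotype


def progeny(parent_a, parent_b, paramutation=True):
    """Return the set of genotypes obtainable from a cross.

    Each parent is a pair of alleles; each offspring receives one
    allele from each parent. With paramutation switched on, the
    conversion rule is applied to every offspring genotype.
    """
    result = set()
    for a in parent_a:
        for b in parent_b:
            genotype = tuple(sorted((a, b)))
            if paramutation:
                genotype = paramutate(genotype)
            result.add(genotype)
    return result


def phenotype(genotype):
    """Dark if at least one active B-I allele remains, else light."""
    return "dark" if B_I in genotype else "light"


naive = (B_I, B_I)
paramutagenic = (B_PRIME, B_PRIME)

# First cross: naive B-I x B'
f1 = progeny(naive, paramutagenic)
# Second cross: an F1 plant x a fresh naive B-I plant
backcross = progeny(next(iter(f1)), naive)

# Hypothetical control in which no conversion occurs
f1_control = progeny(naive, paramutagenic, paramutation=False)
backcross_control = progeny(next(iter(f1_control)), naive, paramutation=False)

print({phenotype(g) for g in f1}, {phenotype(g) for g in backcross})
print({phenotype(g) for g in f1_control}, {phenotype(g) for g in backcross_control})
```

In this model, the converted allele behaves exactly like the original B′ allele in the next cross, which is the defining feature of paramutation; the control shows that ordinary segregation without conversion would not give this outcome.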
There are several indications that an RNA intermediate is involved in paramutation. The repetitive region upstream of the b1 locus is subject to bidirectional transcription, and the RdRP MOP1 is required for paramutation at this locus and several other loci in maize (Alleman et al., 2006; Dorweiler et al., 2000). In addition, at the paramutable pl1 locus, it was found that a homolog of the previously mentioned DRD1 chromatin factor, RMR1, and Pol IV are involved (Erhard et al., 2009; Hale et al., 2007; Hollick and Chandler, 2001; Sidorenko et al., 2009), suggesting a mechanism related to that of transposon silencing in Arabidopsis.
Paramutation is probably not limited to plants; similar phenomena have been described in mouse involving the agouti locus, a manipulated version of the imprinted Rasgrf1 locus and the LacZ-inserted Kit locus (for reviews, see Chandler, 2007; Cuzin et al., 2008). In the case of the Kit locus, a role for a small-RNA intermediate has been suggested (Rassoulzadegan et al., 2006). In humans, several epimutations with some degree of non-Mendelian inheritance have been identified, including a hereditary non-polyposis colorectal cancer (HNPCC)-associated MLH1 epimutation (Hitchins et al., 2007).
It is too early to propose a unifying mechanism for these diverse observations, but it is tempting to speculate that heritable gene-silencing events triggered by repetitive sequences or by other small-RNA-generating regions occur in all eukaryotic organisms. Although examples in animals are currently rare and enigmatic, it is notable that small-RNA-mediated silencing events discovered in one branch of the eukaryotic tree have, in multiple cases, been found to exist in other branches as well. It would therefore not be surprising if phenomena similar to paramutation were eventually found to occur in humans.
DNA elimination in ciliates
The regulatory potential of small RNAs is not limited to chromatin modification, but can involve more drastic effects. The ciliates Tetrahymena and Paramecium exhibit an extreme level of small-RNA-mediated silencing, involving the elimination of entire transposable and repetitive elements from their active ‘somatic’ genomes (Fig. 5) [for a recent review, see Duharcourt et al. (Duharcourt et al., 2009)].
Ciliates are characterized by the presence of two types of nuclei in a unicellular organism: a transcriptionally inactive diploid micronucleus, which represents the germ line, and an active polyploid macronucleus, which functions as soma. The macronucleus hosts gene expression during the vegetative state, but is lost during sexual reproduction, after which it is replaced by a macronucleus that is newly formed from the zygotic micronucleus. Elimination of micronucleus-specific sequences from the macronucleus is mediated by a special class of small RNA named scanRNAs. The 25-30-nucleotide scanRNAs are synthesized during meiosis from the entire micronuclear genome and associate with Piwi-like Argonaute proteins. Also during meiosis, long non-coding transcripts (ncRNAs) are synthesized from the entire maternal macronuclear genome. Interactions between the scanRNAs and the ncRNAs result in the elimination of all scanRNAs that recognize a target, leaving only the micronucleus-specific scanRNAs. Upon formation of a new zygotic macronucleus, these remaining scanRNAs identify the micronucleus-specific sequences, which are then marked by H3K9 methylation (Liu et al., 2004; Liu et al., 2007; Taverna et al., 2002) and eliminated by a specific endonuclease. Histone modification is thought to guide DNA elimination (Taverna et al., 2002), although the precision of sequence excision in Paramecium exceeds the nucleosome scale, suggesting that H3K9 methylation only assists in recruiting the endonuclease to the target site. The scanRNAs are required both for the establishment of histone modifications and for DNA elimination, and are therefore thought to induce DNA elimination (Malone et al., 2005; Mochizuki and Gorovsky, 2005).
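The logic of this genome-scanning model is essentially a subtraction, which can be illustrated in a few lines of code. The sketch below is a deliberately simplified toy and not a description of the actual molecular mechanism: genomes are short single-stranded strings, scanRNAs and macronuclear transcripts are both represented as fixed-length substrings, and matching is exact. The sequences, the length of 5 (rather than 25-30 nucleotides) and all function names are assumptions made for illustration.

```python
# Toy model of scanRNA-based genome subtraction in ciliates.
# Assumptions (illustrative only): single-stranded string genomes,
# exact matching, and an arbitrary short small-RNA length.

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def surviving_scanrnas(mic_genome, mac_genome, k):
    """scanRNAs made from the whole micronuclear genome, minus those
    that find a match among transcripts of the parental macronucleus."""
    pool = kmers(mic_genome, k)             # produced during meiosis
    mac_transcripts = kmers(mac_genome, k)  # stand-in for long ncRNAs
    return pool - mac_transcripts


def mark_for_elimination(mic_genome, survivors, k):
    """Mark every position in the new macronucleus (initially a copy
    of the micronuclear genome) that is covered by a surviving scanRNA."""
    marked = [False] * len(mic_genome)
    for i in range(len(mic_genome) - k + 1):
        if mic_genome[i:i + k] in survivors:
            for j in range(i, i + k):
                marked[j] = True
    return marked


# Hypothetical sequences: the micronucleus carries an extra element
# (IES) that is absent from the parental macronucleus.
LEFT, IES, RIGHT = "ACGTTGCAAC", "GGGGGCCCCC", "TATCAGTCTA"
MAC = LEFT + RIGHT
MIC = LEFT + IES + RIGHT
K = 5

survivors = surviving_scanrnas(MIC, MAC, K)
marked = mark_for_elimination(MIC, survivors, K)

# 'x' = marked for elimination, '.' = retained
print(MIC)
print("".join("x" if m else "." for m in marked))
```

In this toy, the marked interval can extend up to k-1 positions into the flanking sequence, because substrings that span a junction are also absent from the macronucleus. The model therefore says nothing about how the precise excision boundaries are set.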
DNA elimination allows ciliates to control transposable elements – eliminating them from their active genomes limits the opportunity for them to ‘jump’. Although this type of communication between two nuclei within the same cell is unique to ciliates, some level of interaction between somatic and germ-line nuclei has also been observed in multicellular organisms. In Arabidopsis, the pollen vegetative nucleus produces high levels of transposon-directed siRNAs that are subsequently loaded into the mature sperm, thereby protecting the future zygote against transposition (Slotkin et al., 2009). Also, in Drosophila, the zygote is protected by somatic parental contribution of small RNA (Brennecke et al., 2008; Malone et al., 2009). These examples indicate that the principle of small-RNA crosstalk between different nuclei and cells might be a widely used concept in genome protection.
The paradox of active heterochromatin
In many of the mechanisms discussed thus far, small RNAs are derived from heterochromatic sequences and play a role in maintaining these regions in an inactive state. To produce small RNAs, however, these heterochromatic regions need to be transcribed. Moreover, target recognition by small RNAs is also thought to occur on nascent RNA rather than on DNA, implying that these loci cannot be completely inactive. In Arabidopsis, silenced transposable elements are transcribed by the specialized polymerases Pol IV and Pol V, suggesting that transcription of heterochromatin requires special features of a polymerase. These two plant-specific polymerases were identified as early duplications of the major Pol II subunit (Luo and Hall, 2007). Although specialized polymerases have not yet been identified in animals or fungi, the involvement of Pol II subunits in heterochromatin maintenance is more widespread. In S. pombe, the transcription of pericentromeric heterochromatin was found to require several Pol II subunits (Djupedal et al., 2005; Kato et al., 2005). In Drosophila, interaction between Pol II subunits and the RNAi machinery was detected (Kavi and Birchler, 2009). This opens up the possibility that other eukaryotes have Pol II variants that achieve the same effect as Pol IV and Pol V in plants.
This paradoxical activity of heterochromatin indicates that the distinction between heterochromatin and euchromatin is far from black and white, and that heterochromatin might be more dynamic than previously anticipated. S. pombe centromeric transcription and heterochromatin were indeed found to be dynamically regulated throughout the cell cycle (Chen, E. S. et al., 2008; Kloc et al., 2008). During mitosis and G1 phase, the heterochromatin protein Swi6 is progressively lost from the chromatin, allowing temporary transcription of pericentromeric regions. As a result, pericentromeric transcripts accumulate during S phase and are rapidly processed into siRNAs to re-establish heterochromatin and cohesin patterns for the next division.
A recent finding in Drosophila has also suggested that different variants of HP1 are used to mark the transcriptional activity of different heterochromatic regions (Klattenhoff et al., 2009). The germline-specific HP1 variant Rhino was found to localize to a piRNA locus, replacing the generic HP1, and is required for piRNA production from that locus. Although it is currently not known whether this finding extends to all piRNA loci or whether this mechanism is conserved in other species, it indicates that heterochromatin comes in different flavors and that more nuances in heterochromatin activity are to be expected.
Concluding remarks
The examples discussed here clearly indicate that small RNAs can have direct effects on chromatin. Phenomena involving the regulation of chromatin by small non-coding RNAs are widespread and very diverse. Different classes of small RNAs can mediate chromatin modification and organisms can use different molecular solutions to obtain a similar biological result. The abundance of small RNAs suggests that additional new classes and subclasses of small RNAs will probably be discovered in the future. It is possible that these discoveries will shed light on even more chromatin-modifying mechanisms or help to disentangle pathways for phenomena that are currently enigmatic.
Although different types of sequences can be targeted by small-RNA-mediated silencing mechanisms, it is generally a property of the targeted sequence (e.g. repetitiveness or pairing status), rather than the nucleotide sequence itself, that triggers the response. This, in a nutshell, is the unique capacity of small-RNA-mediated regulation: the ability to modulate sequences using an operational definition. Sequences that need to be silenced are defined based on their properties, and the relevant small RNAs can be produced and/or stabilized ‘on the fly’. In many instances, small-RNA-based mechanisms seem to initiate rather than maintain a specific form of chromatin, suggesting that, when maintenance mechanisms fail or are actively evaded, small RNAs are there to reinitiate the required chromatin state. Small RNAs therefore constitute a true genome-surveillance mechanism.