ABSTRACT
Cell-cell interactions mediated by the Notch receptor play an essential role in the development of the Drosophila adult peripheral nervous system (PNS). Transcriptional activation of multiple genes of the Enhancer of split Complex [E(spl)-C] is a key intracellular response to Notch receptor activity. Here we report that most E(spl)-C genes contain a novel sequence motif, the K box (TGTGAT), in their 3′ untranslated regions (3′ UTRs). We present three lines of evidence that demonstrate the importance of this element in the post-transcriptional regulation of E(spl)-C genes. First, K box sequences are specifically conserved in the orthologs of two structurally distinct E(spl)-C genes (m4 and m8) from a distantly related Drosophila species. Second, the wild-type m8 3′ UTR strongly reduces accumulation of heterologous transcripts in vivo, an activity that requires its K box sequences. Finally, m8 genomic DNA transgenes lacking these motifs cause mild gain-of-function PNS defects and can partially phenocopy the genetic interaction of E(spl)D with Notchspl. Although E(spl)-C genes are expressed in temporally and spatially specific patterns, we find that K box-mediated regulation is ubiquitous, implying that other targets of this activity may exist. In support of this, we present sequence analyses that implicate genes of the iroquois Complex (Iro-C) and engrailed as additional targets of K box-mediated regulation.
INTRODUCTION
The fruit fly Drosophila melanogaster is covered with a regular and stereotyped array of external mechanosensory organs, which constitute the majority of its peripheral nervous system (PNS). The determination and differentiation of these sensory organs take place during late larval and early pupal development (Hartenstein and Posakony, 1989; Huang et al., 1991; Usui and Kimura, 1993), in a process that has been well characterized both genetically and molecularly. Briefly, the spatially patterned activity of the basic helix-loop-helix (bHLH) transcriptional activators achaete (ac) and scute (sc) defines ‘proneural clusters,’ or groups of cells that are competent to adopt a sensory organ precursor (SOP) fate (Cabrera and Alonso, 1991; Cubas et al., 1991; Skeath and Carroll, 1991; Van Doren et al., 1992). Then, inhibitory cell-cell interactions requiring the function of the neurogenic genes restrict the SOP fate to a single cell in each proneural cluster (Dietrich and Campos-Ortega, 1984; Hartenstein and Posakony, 1990; Parks and Muskavitch, 1993; Schweisguth and Posakony, 1992; Tata and Hartley, 1995). This ‘lateral inhibition’ process is mediated largely by signaling through the Notch receptor. Notch signaling is further required for the cell fate asymmetry of each of the three cell divisions of the SOP lineage, which yield the four distinct component cells of a mechanosensory organ (Hartenstein and Posakony, 1990; Posakony, 1994).
Recently (Lai and Posakony, 1998; Leviten et al., 1997), we described the Brd box and the GY box, sequence motifs located in the 3′ UTRs of many genes functioning or implicated in Notch signaling, including Bearded (Brd) (Leviten and Posakony, 1996) and multiple genes of the Enhancer of split complex [E(spl)-C]. The E(spl)-C encodes seven bHLH repressor proteins, which function as effectors of the lateral inhibitory signal (Klämbt et al., 1989; Schrons et al., 1992). This complex also includes a gene, m4, the protein product of which is novel but related to the protein encoded by Brd (Leviten et al., 1997). Another member of the E(spl)-C is groucho (gro), which encodes a ubiquitously expressed nuclear protein that functions as a transcriptional co-repressor for bHLH and other DNA-binding repressor proteins (Fisher and Caudy, 1998). Several E(spl)-C genes, including m4, are known to be directly activated in proneural clusters by the transcription factor Suppressor of Hairless in response to Notch receptor activity (Bailey and Posakony, 1995; Furukawa et al., 1995; Lecourtois and Schweisguth, 1995). Brd is likewise expressed specifically in proneural clusters, under ac/sc control (Bailey and Posakony, 1995; Leviten et al., 1997; Singson et al., 1994). Moreover, gain-of-function alleles of Brd confer a bristle ‘tufting’ phenotype indistinguishable from that produced by loss-of-function mutations in genes of the Notch pathway (Leviten and Posakony, 1996), suggesting that Brd has a normal role in lateral inhibition. The Brd box motif shared by these genes negatively regulates protein and transcript levels, and mutation of Brd and GY boxes in a Brd transgene renders it hyperactive and capable of interfering with normal PNS development (Lai and Posakony, 1997). Thus, the Brd box appears to have an important function in controlling the activities of two families of genes involved in Notch signaling.
Here we report the identification of a novel hexamer motif, the K box (TGTGAT), that is present in the 3′ UTRs of seven of the nine genes of the E(spl)-C. We also describe an additional sequence, the CAAC motif, which is closely associated with Brd, GY and K boxes in these 3′ UTRs. Consistent with previous observations suggesting that the E(spl)m8 3′ UTR is associated with negative regulatory activity (Kramatschek and Campos-Ortega, 1994), we find using a heterologous reporter gene assay that the m8 3′ UTR mediates a spatially and temporally general form of negative regulation.
This activity is largely mediated by K boxes, which act principally at the level of transcript accumulation. Using genomic E(spl)m8 transgenes, we show that loss of K box-mediated regulation interferes with normal PNS development. In particular, m8 transgenes lacking these motifs confer characteristic gain-of-function phenotypes and also exhibit genetic interactions with Hairless (H) and the split allele of Notch. Finally, we present sequence analyses that strongly suggest that engrailed and genes of the iroquois Complex (Iro-C) are additional targets of K box-mediated regulation.
MATERIALS AND METHODS
Drosophila stocks
w1118; P[w+, arm-lacZ-SV40 t] transgenic fly lines are described in Lai and Posakony (1997). E(spl)D is described in Lindsley and Zimm (1992), as are HE31/TM6B and y w Nspl, utilized herein for genetic interaction studies involving E(spl)m8 genomic DNA transgenes.
Molecular biology
General molecular biology techniques were performed as described (Ausubel et al., 1987). Preparation of RNA and protein extracts, and probing and quantitation of northern and western blots, were performed as described in Lai and Posakony (1997). The probe for detection of E(spl)m8 transcripts (Fig. 8) was a 1.05 kb BglII fragment that includes nearly all of the m8 transcription unit.
Plasmid construction
Mutant E(spl)m8 3′ UTRs
A 0.95 kb EcoRV-ClaI subclone containing the C-terminal region of E(spl)m8, its entire 3′ UTR and approximately 300 bp of downstream genomic DNA sequence was used as a template for site-directed mutagenesis. Primers (sequences available upon request) were used to generate the following mutant E(spl)m8 3′ UTRs, with novel diagnostic restriction sites introduced as indicated: m8 K/CAAC del (deletes all nucleotides between and including CAAC #1 and CAAC #5; nt 695-736 of the transcription unit); m8 K1K2 mut (changes K #1 from tcTGTGATag to tcCTCGAGag [XhoI] and K #2 from acTGTGATgg to acCCCGGGgg [SmaI]); and m8 5x CAAC mut (changes CAAC #1-3 from acCAACCAACAACgc to acCACTTAAGTACgc [AflII], CAAC #4 from ccCAACtg to ccGTACtg, and CAAC #5 from tggCAACaa to tggCCAGaa [MscI]). The specificity of all mutations was confirmed by sequencing.
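The point mutations listed above can be checked mechanically. The following is a minimal sketch, not part of the original methods: it counts the base changes in each mutant sequence and tests whether the indicated diagnostic restriction site is created. The wild-type and mutant sequences are copied from the text; the recognition sequences in `SITES` are the standard ones for these enzymes and are supplied here as an assumption, and the names `MUTATIONS`, `count_changes` and `check` are hypothetical.

```python
# Wild-type and mutant sequences as listed in the text
# (flanking bases in lower case, motif bases in upper case).
MUTATIONS = {
    "K #1": ("tcTGTGATag", "tcCTCGAGag", "XhoI"),
    "K #2": ("acTGTGATgg", "acCCCGGGgg", "SmaI"),
    "CAAC #1-3": ("acCAACCAACAACgc", "acCACTTAAGTACgc", "AflII"),
    "CAAC #4": ("ccCAACtg", "ccGTACtg", None),  # no diagnostic site indicated
    "CAAC #5": ("tggCAACaa", "tggCCAGaa", "MscI"),
}

# Standard recognition sequences (assumption: not stated in the text).
SITES = {"XhoI": "CTCGAG", "SmaI": "CCCGGG", "AflII": "CTTAAG", "MscI": "TGGCCA"}


def count_changes(wild_type, mutant):
    """Number of positions at which two equal-length sequences differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    return sum(a.upper() != b.upper() for a, b in zip(wild_type, mutant))


def check(name):
    """Report base changes and whether the diagnostic site is newly created."""
    wild_type, mutant, enzyme = MUTATIONS[name]
    site = SITES.get(enzyme)
    created = (
        site is not None
        and site in mutant.upper()
        and site not in wild_type.upper()
    )
    return {"changes": count_changes(wild_type, mutant), "site_created": created}


if __name__ == "__main__":
    for name in MUTATIONS:
        print(name, check(name))
```

Run as a script, this reports 4 and 5 base changes for K #1 and K #2, respectively, and 2 changes each for CAAC #4 and CAAC #5. CAAC #1-3 shows 5 changes across three motifs because CAAC #2 and CAAC #3 share one base.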
arm-lacZ-m8 3′ UTR constructs
E(spl)m8 3′ UTRs were excised from the wild-type and three mutant constructs as NarI-BglII fragments, which contain the last 9 nt of the m8 coding region, the entire 3′ UTR and approximately 150 bp downstream of the polyadenylation signal. These fragments were filled in with Klenow polymerase and cloned into the HincII-EcoRV sites of pBluescript with the proximal region closer to the KpnI site in the polylinker. These 3′ UTRs were then cloned into pCaSpeR-ac-lacZ (Lai and Posakony, 1997) as KpnI-EcoRI fragments. The ac promoter was then replaced by the 1.8 kb arm promoter to create arm-lacZ-m8 wt, arm-lacZ-m8 K/CAAC del, arm-lacZ-m8 K1K2 mut and arm-lacZ-m8 5x CAAC mut.
E(spl)m8 genomic DNA constructs
A 2.6 kb EcoRI-ClaI wild-type E(spl)m8 genomic DNA fragment was subcloned from a larger genomic fragment described by Knust et al. (1987). The 0.95 kb EcoRV-ClaI mutant 3′ UTR fragments replaced the corresponding wild-type fragment in the 2.6 kb subclone to create m8 2.6 K/CAAC del and m8 2.6 K1K2 mut. The wild-type and mutant E(spl)m8 genomic DNAs were then cloned into CaSpeR-K (Margolis et al., 1995) as KpnI-XbaI fragments to yield m8 wild-type, m8 K/CAAC del and m8 K1K2 mut P-element-transformation constructs.
Germline transformation
The P-element-transformation constructs described above were coinjected with a Δ2-3 helper plasmid into the recipient strain w1118 (Rubin and Spradling, 1982). For each construct, 7-16 independent homozygous transgenic lines were analyzed.
β-galactosidase activity staining
Imaginal discs and other organs were dissected from late third-instar larvae and stained for β-galactosidase activity as described (Romani et al., 1989). Preparations were dehydrated in an ethanol series and mounted in Epon.
DNA sequence searches
Initial searches for sequence elements among E(spl)-C 3′ UTR sequences were carried out with the CONSENSUS program [(Hertz et al., 1990); Versions 4a of CONSENSUS and 3d of WCONSENSUS: G. Hertz, personal communication], which systematically – using a matrix consensus representation and information content analysis – scans a set of sequences for oligomers common to them.
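As a much simplified illustration of what it means to scan a set of sequences for common oligomers, the sketch below reports the exact k-mers present in every input sequence. It is not the CONSENSUS algorithm, which uses a matrix representation and information content; exact-match set intersection is swapped in here for brevity. The function names and the toy sequences in the usage example are hypothetical.

```python
def kmers(seq, k):
    """Set of all k-mers (as upper-case strings) occurring in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(seqs, k=6):
    """k-mers that occur at least once in every sequence of seqs."""
    sets = [kmers(s, k) for s in seqs]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not real 3' UTRs.
    toy = ["aaTGTGATcc", "ggcTGTGATa", "TGTGATtttt"]
    print(shared_kmers(toy, k=6))
```

An exact-match search of this kind can only recover motifs that are strictly identical across sequences, which happens to be the case for the K box as described here.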
RESULTS
Two new classes of 3′ UTR sequence motifs in genes of the E(spl)-C and Brd
Recently, we reported the identification of two novel classes of sequence motifs in the 3′ UTRs of Brd and most genes of the E(spl)-C, the Brd box (AGCTTTA) and the GY box (GTCTTCC) (Lai and Posakony, 1998; Leviten et al., 1997). Our investigation of the in vivo function of these elements has also been reported recently (Lai and Posakony, 1997). Here, we describe two additional classes of sequence elements in the 3′ UTRs of these genes, the K box and the CAAC motif.
Like the Brd box and the GY box, the K box is an exact sequence identity present in one or more copies in the 3′ UTRs of Brd and several genes of the E(spl)-C (Fig. 1A). A core hexamer sequence (TGTGAT; rich in keto nucleotides) defines the K box element, and there are additional strong biases in the 5′ and 3′ flanking nucleotides. While most of the genes listed in Fig. 1A contain single K boxes in their 3′ UTRs, two genes [E(spl)m8 and E(spl)mδ] each contain a pair of closely spaced K boxes; this arrangement may be of particular consequence. With a single exception [in the coding region of E(spl)m5], all K boxes on the sense strand of these genes reside in their 3′ UTRs; moreover, within the 3′ UTRs, the K box sequence occurs only in the sense orientation. Thus, there is significant specificity to both the location and the orientation of the K box.
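The location and orientation specificity described above can be examined with a simple strand-aware scan. The following is a minimal sketch, assuming the input is the sense-strand sequence of a transcript region: it reports the positions of the K box hexamer and of its reverse complement (ATCACA), and computes the fraction of keto nucleotides (G and T) in a motif. All function names are hypothetical, and the sequence in the usage example is a toy string, not a real 3′ UTR.

```python
import re

K_BOX = "TGTGAT"


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, motif):
    """0-based start positions of motif in seq, overlapping matches included."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq.upper())]


def k_box_sites(seq):
    """Positions of the K box in sense and antisense orientation."""
    return {
        "sense": find_motif(seq, K_BOX),
        "antisense": find_motif(seq, reverse_complement(K_BOX)),
    }


def keto_fraction(motif):
    """Fraction of keto nucleotides (G or T) in a motif."""
    return sum(base in "GT" for base in motif.upper()) / len(motif)


if __name__ == "__main__":
    toy = "ccTGTGATaaATCACAgg"  # toy sequence for illustration only
    print(k_box_sites(toy))
    print(keto_fraction(K_BOX))
```

For the K box itself, five of the six positions are keto nucleotides.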
The K box sequence element appears not to have any consistent positional relationship with either Brd boxes or GY boxes, either in the linear sequence (Fig. 2) or in predicted secondary structures of these transcripts (data not shown). However, a fourth sequence element, the CAAC motif (CAAC) does link a subset of these other motifs. As shown in Fig. 1B, the tetranucleotide CAAC is tightly associated with one third of Brd, GY and K box elements. Moreover, the second Brd box in E(spl)m4, the third Brd box in E(spl)m5, and the first K box of E(spl)m8 are all associated with multiple CAAC motifs. CAAC motifs are specifically concentrated upstream of Brd, GY and K boxes (Fig. 1B) and could conceivably influence their activities. Though there does not appear to be a strict spacing requirement between CAAC motifs and the Brd, GY or K boxes, CAAC sequences are found immediately upstream in more than half of the CAAC-Brd/GY/K pairs, with adjacent CAAC-K box pairs most prevalent (Fig. 1B). Overall, although this short motif occurs with essentially random frequency in the 3′ UTRs of E(spl)-C genes and Brd, its presence is highly enriched in the vicinity of Brd, GY and K boxes.
Evolutionary conservation of K boxes and CAAC motifs
The strict identity of the K box sequences in Brd and the genes of the E(spl)-C suggests that each nucleotide of this motif is important for its function. If this is the case, one might expect its precise sequence to be conserved over the course of Drosophila evolution. D. hydei is approximately 60 million years diverged from D. melanogaster (Beverley and Wilson, 1984), sufficiently distant for substantial sequence randomization to occur outside of coding and regulatory regions (Caccone and Powell, 1990). We analyzed the sequences of the 3′ UTRs of two genes from D. hydei, E(spl)m4 (a non-bHLH gene) and E(spl)m8 (a bHLH gene), and found that K boxes and their flanking nucleotides are fully conserved in all three instances [Figs 1 and 3; see Lai and Posakony (1997)]. In particular, the most conserved region between the m8 3′ UTRs includes the pair of K boxes (Fig. 3), suggesting that this region is indeed of functional significance. Otherwise, these 3′ UTR sequences are quite divergent.
Although neither the exact location nor the number of CAAC motifs is completely conserved, this sequence is found upstream of m4 B2, m8 K1 and m8 K2 in both species, while a single GY box in the 3′ UTR of each m4 ortholog is also associated with a CAAC motif (Fig. 1). It is noteworthy that all eight CAAC motifs in the approximately 600 bp of combined m8 3′ UTR sequence are clustered around the K boxes (Fig. 3), further suggesting that the proximity of these motifs is not coincidental.
The K box mediates negative post-transcriptional regulation
To investigate the in vivo function of the K box, we made use of the E(spl)m8 3′ UTR, which includes two evolutionarily conserved copies of this motif and lacks both Brd boxes and GY boxes. An arm-lacZ-SV40 t reporter transgene, utilizing 1.8 kb of the armadillo promoter and the 3′ UTR of the SV40 t antigen gene, is expressed at high levels in most cells throughout development (Lai and Posakony, 1997; Vincent et al., 1994). We compared the reporter activity of this transgene with that of a version that includes the m8 3′ UTR (Fig. 4) and found that the wild-type m8 (m8 wt) 3′ UTR sharply reduces reporter gene activity relative to the SV40 t 3′ UTR (cf. Fig. 5A,F,K and Fig. 5B,G,L). Although E(spl)m8, as well as other genes of the E(spl)-C and Brd, are expressed in a spatially restricted pattern in both embryos and larvae, we observed a drastic and largely uniform reduction in reporter activity throughout larval and imaginal tissues of the third-instar larva as well as throughout embryonic development (Fig. 5 and data not shown). Thus, the E(spl)m8 3′ UTR appears to confer a ubiquitous and temporally general form of negative regulation. To test the specific role of the K boxes in regulation by the m8 3′ UTR, we constructed mutant versions of the reporter transgene in which all K boxes and CAAC motifs were deleted (m8 K/CAAC del), or in which 4 or 5 point mutations were introduced into both K boxes (m8 K1K2 mut) (Fig. 4). As shown in Fig. 5 (C,H,M), deletion of the K/CAAC region restores moderate levels of reporter activity throughout all tissues. Interestingly, all transgenic lines carrying the m8 K1K2 mut construct exhibited significantly higher levels of reporter activity than those of the m8 K/CAAC del lines (Fig. 5D,I,N). 
Closely comparable effects were observed with both of these K box-mutant transgenes during embryonic development, indicating that a spatially and temporally general mode of negative regulation by the m8 3′ UTR requires the integrity of the K boxes (Fig. 5 and data not shown).
To investigate the possible role of the CAAC motifs, we constructed a mutant reporter gene containing 2 bp mutations in each of the five CAAC motifs surrounding the K boxes in the m8 3′ UTR (m8 5x CAAC mut). This mutant 3′ UTR retained strong negative regulatory activity, presumably due at least in part to its intact K boxes. Nevertheless, specific mutation of the CAAC motifs led to a detectable and reproducible increase in reporter activity relative to the m8 wt 3′ UTR (Fig. 5E,J,O). Therefore, CAAC motifs are required for the full regulatory activity of the m8 3′ UTR.
K boxes function primarily at the level of transcript accumulation
As our reporter gene results indicated that K boxes are essential for normal negative regulation by the E(spl)m8 3′ UTR, we were interested to determine at what level such regulation occurs. To address this, we performed northern and western blot analysis of RNA and protein accumulation for sets of five representative arm-lacZ transgenic lines for each 3′ UTR tested. Samples were prepared from 2-26 hour embryos, and reporter levels were quantitated relative to the ubiquitously expressed endogenous genes rp49 (RNA) and groucho (protein).
We found that negative regulation mediated by K boxes is largely effected through decreased transcript levels. As shown in Fig. 6 and Table 1, mutation of the K boxes (m8 K1K2 mut) results in a large increase (approximately 4-fold) in the steady-state level of total lacZ transcripts, as well as a somewhat larger increase (6.5-fold) in the polyadenylated population. Quantitative western blot analysis using 125I-Protein A showed that mutation of the K boxes has an additional effect on the steady-state protein level, leading to a nearly 11-fold difference between the β-galactosidase protein levels from wild-type and K box-mutant reporter gene constructs. Although we cannot rule out a potential effect of the K box on transcription, these data suggest that K boxes act primarily at the level of transcript stability and may secondarily reduce translational efficiency (approximately 2-fold) as well.
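The approximately 2-fold translational component can be recovered from the fold changes quoted above by simple division. This is a minimal sketch, assuming a multiplicative model in which steady-state protein level is proportional to transcript level times translational efficiency; that model is our simplification for illustration. The input values are the approximate figures given in the text, and the function name is hypothetical.

```python
def residual_fold(protein_fold, transcript_fold):
    """Fold change in protein not accounted for by the change in transcript.

    Assumes protein level is proportional to transcript level multiplied by
    translational efficiency (a simplifying assumption).
    """
    return protein_fold / transcript_fold


# Approximate fold increases on K box mutation, as quoted in the text.
TOTAL_RNA_FOLD = 4.0   # total lacZ transcripts
POLYA_RNA_FOLD = 6.5   # polyadenylated lacZ transcripts
PROTEIN_FOLD = 11.0    # beta-galactosidase protein

vs_polya = residual_fold(PROTEIN_FOLD, POLYA_RNA_FOLD)
vs_total = residual_fold(PROTEIN_FOLD, TOTAL_RNA_FOLD)

if __name__ == "__main__":
    print(f"relative to polyadenylated RNA: {vs_polya:.2f}-fold")
    print(f"relative to total RNA: {vs_total:.2f}-fold")
```

Depending on which transcript population is used as the denominator, the residual lies between about 1.7-fold and 2.75-fold, in line with the estimate of approximately 2-fold.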
We also detected a distinct contribution of the CAAC motifs to regulation by the E(spl)m8 3′ UTR. In particular, specific mutation of the CAAC sequences (m8 5x CAAC mut) was found to result in a >2-fold increase in steady-state reporter transcript level and a comparable increase in reporter protein level. These data suggest that the CAAC motifs also participate in negative regulation by the m8 3′ UTR, probably at the level of RNA stability. Despite this result, we observed that reporter transcript and protein levels from the m8 K/CAAC del construct were intermediate relative to those of the wild-type construct and the K box mutant, in agreement with the β-galactosidase activity staining data (cf. Figs 5 and 6). Although this deletion construct removes essentially only K boxes and CAAC motifs, nonspecific changes in RNA secondary structure or the shortening of the m8 3′ UTR might indirectly affect its metabolism. Thus, the activity of the K/CAAC del 3′ UTR may not be directly predictable from the results with the point mutant constructs. Nevertheless, this deletion does cause a strong increase in steady-state transcript and protein levels (Fig. 6; Table 1), consistent with the loss of K box and CAAC motif regulation from this construct.
Loss of K box-mediated regulation interferes with normal adult PNS development
A previous study has shown that the hyperactivity of the original E(spl)D allele of E(spl)m8 is due to a small deletion at the 3′ end of the gene, including C-terminal codons and the 3′ UTR (Tietze et al., 1992). Since this deletion removes both of the K box elements that we have identified, we were interested to investigate whether loss of K box-mediated regulation is sufficient to allow m8 transgenes to interfere with PNS development. While two copies of a wild-type m8 transgene are not sufficient to perturb sensory organ development (Tietze et al., 1992), overexpression of E(spl)m8 causes pronounced phenotypic defects in the PNS (Nakao and Campos-Ortega, 1996; Tata and Hartley, 1995).
We subcloned a 2.6 kb E(spl)m8 genomic DNA fragment containing 1.17 kb of upstream promoter sequence as our wild-type transgene. This fragment contains all known transcriptional activation sequences for this gene (Bailey and Posakony, 1995; Kramatschek and Campos-Ortega, 1994; Singson et al., 1994), but is approximately 100 bp shorter than the transgene described by Tietze et al. (1992). We also constructed mutant m8 transgenes containing the mutations in the two K boxes (m8 K1K2 mut) or a deletion of the K/CAAC region (m8 K/CAAC del) as illustrated in Fig. 4.
As expected, none of 12 homozygous wild-type transgenes conferred any detectable mutant phenotype in the otherwise wild-type background of the recipient strain. In contrast, 4/14 lines homozygous for the m8 K1K2 mut transgene exhibited highly (>80%) penetrant PNS and/or wing venation defects, while an additional 3 lines exhibited low-penetrance (<20%) defects (Fig. 7A-D). Collectively, these defects included loss of anterior orbital, postvertical and scutellar macrochaetes, loss of interocellar microchaetes, and extra/missing crossvein material. A smaller proportion of the m8 K/CAAC del transgenes exhibited such defects, with only one out of sixteen lines exhibiting a fully penetrant mutant phenotype and three other lines displaying low-penetrance defects. Although these effects are quite mild, they are qualitatively similar to those produced by E(spl)m8 overexpression constructs, indicating that they are likely to be gain-of-function phenotypes (Nakao and Campos-Ortega, 1996; Tata and Hartley, 1995). Therefore, we conclude that K box-mediated regulation is necessary to restrict levels of E(spl)m8 activity in vivo.
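The proportions of affected lines for the three constructs can be tabulated directly from the counts given above. This is a minimal sketch for illustration only: it performs no statistical test, and it groups the single fully penetrant m8 K/CAAC del line with the highly penetrant class, which is our simplification. The names `LINES` and `affected_fraction` are hypothetical.

```python
from fractions import Fraction

# Line counts as given in the text. "high" groups highly penetrant and
# fully penetrant lines together (a simplification for this tally).
LINES = {
    "m8 wild-type": {"total": 12, "high": 0, "low": 0},
    "m8 K1K2 mut": {"total": 14, "high": 4, "low": 3},
    "m8 K/CAAC del": {"total": 16, "high": 1, "low": 3},
}


def affected_fraction(counts):
    """Fraction of independent lines showing any PNS or wing vein defect."""
    return Fraction(counts["high"] + counts["low"], counts["total"])


if __name__ == "__main__":
    for construct, counts in LINES.items():
        frac = affected_fraction(counts)
        print(f"{construct}: {frac} ({float(frac):.0%}) of lines affected")
```

On these counts, half of the m8 K1K2 mut lines and one quarter of the m8 K/CAAC del lines show some defect, whereas none of the wild-type lines does.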
K box-mutant E(spl)m8 transgenes display phenotypic interactions with H and spl
To characterize further the in vivo activity of K box-mutant transgenes, we examined their behavior in sensitized genetic backgrounds. Loss-of-function mutations in Hairless (H) increase the activity of Su(H), a transcription factor that directly links Notch receptor activity with transcriptional activation of E(spl)-C genes, including E(spl)m8 (Bailey and Posakony, 1995; Schweisguth and Posakony, 1994). Flies heterozygous for the null allele HE31 display PNS defects consistent with excess Notch signaling, including missing or double-socket sensory organs (Bang et al., 1991). This is particularly evident on the head of the fly (Fig. 7E). Ten independent insertions of each m8 construct were tested in a single copy in a HE31/+ background. The wild-type transgenes caused only a minimal reduction (<5%) in the numbers of wild-type head macrochaetes (Fig. 7F; Table 2). In contrast, the m8 K/CAAC del and m8 K1K2 mut transgenes had pronounced effects in this genetic background, causing approximately 30% and 40% reduction in head macrochaetes, respectively (Fig. 7G; Table 2). Enhancement of the H phenotype is consistent with increased m8 activity conferred by these mutant transgenes.
We also examined the effect of m8 transgenes on the phenotype of Nspl/+ females, whose eyes are largely wild-type. All seven wild-type m8 transgenes failed to enhance spl, even when two copies of the transgene were present [cf. Fig. 7H,L and 7I,M; see also Tietze et al. (1992)]. However, K box-mutant transgenes (four out of six m8 K/CAAC del lines and five out of seven m8 K1K2 mut lines) clearly enhanced spl, producing roughened eyes in spl/+ females homozygous for the transgene (Fig. 7J,N and K,O). In accord with our other assays of m8 3′ UTR activity, the degree of enhancement was greater for the specific m8 K1K2 mut transgene than for the m8 K/CAAC deletion transgene, as the former caused a greater degree of eye roughening and loss of ommatidia than the latter.
The E(spl)D mutation is associated with an elevated steady-state level of transcript
The spl-enhancing effects of K box-mutant E(spl)m8 transgenes (Fig. 7) are reminiscent of those characteristic of the original E(spl)D mutant. In view of our finding that mutation of K box sequences leads to elevated levels of transcript accumulation from lacZ reporter transgenes (Fig. 6; Table 1), we tested the hypothesis that the hyperactivity of the E(spl)D allele might likewise be reflected in increased transcript accumulation. As shown in Fig. 8, we did indeed observe that, in E(spl)D/E(spl)D embryos and third-instar larvae, the truncated mutant mRNA is present at much higher levels than the normal E(spl)m8 transcript in wild-type animals. In contrast to a previous report (Knust et al., 1987), we fail to detect any normal-sized E(spl)m8 mRNA in E(spl)D homozygotes (Fig. 8). Our results are consistent with the interpretation that deletion of K box sequence elements in the E(spl)D mutation contributes to elevated steady-state transcript levels and hence to the hyperactivity of this allele.
DISCUSSION
Multiple modes of post-transcriptional regulation of E(spl)-C genes
The 3′ UTRs of eukaryotic mRNAs, once thought to be unimportant trailers following protein coding regions, often contain sequences that link RNA transcripts to a wide variety of regulatory pathways. 3′ UTRs have now been found to influence polyadenylation efficiency, transcript localization, transcript stability, and translational efficiency [reviewed by Chen and Shyu (1995); Curtis et al. (1995); St. Johnston (1995)]. Multiple regulatory functions are often localized within a single 3′ UTR; indeed, the known complexity of cis-regulatory activities resident in 3′ UTRs is increasingly reminiscent of that long associated with transcriptional regulatory DNA.
We have identified multiple classes of shared sequence motifs in the 3′ UTRs of Brd and the genes of the E(spl)-C, including the Brd box (AGCTTTA), the GY box (GTCTTCC), and the K box (TGTGAT). These motifs subject these genes to multiple modes of post-transcriptional regulation [Lai and Posakony (1997); this work]. In addition, we have identified a fourth sequence, the CAAC motif, that is located upstream of many Brd, GY and K boxes, and may modulate their activity.
In the present study, we have shown that the integrity of the K box sequence is essential for the strong, negative post-transcriptional regulation conferred on a heterologous lacZ reporter gene by the E(spl)m8 3′ UTR, and that the CAAC motif may also participate in such regulation. Within the context of the m8 gene itself, we found that K box elements are required to limit gene activity and that lack of such regulation has phenotypic consequences in the development of the adult PNS. Whereas wild-type m8 transgenes did not detectably affect development and failed to show interactions with H and spl mutations, specific mutation of the two K boxes in an otherwise identical m8 transgene caused mild gain-of-function PNS defects and yielded strong genetic interactions with both H and spl. Taken together, these data strongly support our proposal that the K box sequence mediates negative regulation in vivo, and demonstrate the developmental importance of this regulation.
The cell fate decisions controlled by Notch signaling are sensitive to small changes in gene dosage (Ashburner, 1982; Bang et al., 1991; Heitzler and Simpson, 1991; Schweisguth and Posakony, 1994), implying the importance of precise control over the expression levels of the genes involved, such as those of the E(spl)-C. Negative post-transcriptional regulation by K box and Brd box elements, which principally affect transcript accumulation and translational efficiency, respectively, provides two means by which such fine control may be obtained [Lai and Posakony (1997); this study]. As most genes of the E(spl)-C are apparently subject to negative regulation mediated by both Brd boxes and K boxes (Fig. 2), accumulation of E(spl)-C proteins should be quantitatively and temporally quite responsive to transcriptional regulation.
Alone among the genes of the E(spl)-C, E(spl)mβ appears to lack Brd, GY and K boxes in its 3′ UTR (Fig. 2). Its pattern of expression in wing imaginal discs is also unique: mβ transcripts accumulate in a complex pattern that covers much of the wing imaginal disc and is not coincident with proneural clusters (de Celis et al., 1996), the sites of co-expression of most of the other E(spl)-C genes, including m4, m7, m8, mδ and mγ (de Celis et al., 1996; Singson et al., 1994). Moreover, it is likely that mβ functions principally in Notch signaling during the development of the pupal wing, where it is expressed in intervein regions (de Celis et al., 1997). Thus, the unique expression pattern and likely functional specialization of mβ are correlated with its lack of 3′ UTR motifs that are otherwise widely shared within the E(spl)-C. In contrast to the other genes, it may be unnecessary or even deleterious to regulate levels of mβ expression via Brd boxes or K boxes.
Regulatory effects of the E(spl)D mutation
The E(spl)-C was originally identified via the mutant E(spl)D, which displays an allele-specific interaction with Nspl (spl) (Lindsley and Zimm, 1992). In particular, E(spl)D renders the spl eye phenotype dominant in females heterozygous for spl. spl-enhancing activity was localized to a small deletion in the 3′ region of E(spl)m8 that removes the C-terminal 56 amino acid residues, adds 9 novel amino acids and removes 315 nucleotides of the 3′ UTR (Klämbt et al., 1989; Tietze et al., 1992). The regulatory basis of E(spl)D’s effects has been difficult to establish, partly because the behavior of this allele has been inconsistent in different phenotypic assays. For example, enhancement of the spl phenotype behaves as a gain-of-function activity, while high-level overexpression of the E(spl)D protein results in dominant-negative phenotypes. In spite of these apparently contradictory effects, a consistent finding has been that the deletion itself appears to increase the activity of the gene. Thus, a construct that includes only a premature stop codon [E(spl)+, stop] has less spl-enhancing activity as well as less dominant-negative activity in an overexpression assay than comparable constructs that include the deletion [E(spl)D, stop], even though both of these constructs encode the same protein (Giebel and Campos-Ortega, 1997; Tietze et al., 1992).
We have presented several lines of evidence from multiple in vivo phenotypic assays that strongly indicate that a K box-mutant E(spl)m8 genomic DNA transgene behaves as a hypermorph. As mutations were introduced only into untranslated regions, all E(spl)m8 proteins expressed from our constructs are expected to be completely wild type. Notably, we found that K box-mutant m8 transgenes can largely phenocopy the spl-enhancing activity of E(spl)D. This verifies that spl enhancement is a gain-of-function activity and suggests that the hyperactivity of the E(spl)D allele is due in part to its lack of K boxes. Consistent with this interpretation, we have shown here that E(spl)D mutants exhibit elevated levels of accumulated transcript. The observed levels (Fig. 8) are high enough, however, that they cannot be accounted for solely by relief from K box-mediated regulation.
E(spl)D proteins also lack the C-terminal WRPW motif, which is required to recruit the co-repressor Groucho by means of a direct protein-protein interaction (Paroush et al., 1994). Since bHLH repressor proteins function as dimers, it is possible that the single Groucho molecule recruited by an E(spl)D/E(spl)+ heterodimer may be functionally sufficient. In this situation, moderately increased levels of truncated E(spl)D monomers might serve to elevate overall E(spl) protein function, yielding a hypermorphic effect. Conversely, when the mutant protein is strongly overexpressed, DNA-bound complexes of E(spl)D/E(spl)D homodimers might predominate; these would be predicted to function in a dominant-negative fashion by virtue of their complete inability to recruit Groucho. Loss of K box-mediated regulation under these circumstances would be predicted to result in an exacerbation of dominant-negative activity.
Genes of the Iro-C and engrailed are likely targets of K box-mediated regulation
Since K box-mediated regulation appears to operate in most cells, it is probable that some genes expressed outside of proneural clusters will also prove to be subject to this mode of control. We have identified a number of Drosophila genes with K box motifs in their 3′ UTRs, which represent candidate targets of K box regulation. Those of particular interest include the homeobox genes of the iroquois Complex (Iro-C) and engrailed (en).
The Iro-C, consisting of the genes araucan (ara), caupolican (caup) and mirror, encodes a family of three related homeodomain transcription factors (Gomez-Skarmeta et al., 1996; McNeill et al., 1997). All three genes appear to contribute to a ‘prepattern’ for sensory organs in the wing imaginal disc, and the ara and caup proteins have been shown to directly activate expression of ac and sc (Gomez-Skarmeta et al., 1996; Kehl et al., 1998). Each of the three Iro-C genes contains a single K box in its 3′ UTR (Fig. 9A). Moreover, the K boxes in ara and caup conform to a much more restricted consensus sequence found in many E(spl)-C K boxes (cTGTGATa) (see Fig. 1). The presence of the K box motif in all three Iro-C genes resembles its widespread distribution in the 3′ UTRs of genes in the E(spl)-C. Thus, genes that activate (Iro-C) and repress [E(spl)-C] proneural gene expression may be regulated by the same post-transcriptional mechanism.
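The kind of motif survey described above can be illustrated with a minimal sketch. It is not the procedure used in this study; it simply assumes that 3′ UTR sequences are available as plain strings and reports the positions of the core K box (TGTGAT) and of the more restricted consensus (cTGTGATa). The function names and the example sequences are hypothetical, and only the sense strand is scanned, on the assumption that the motif acts in the transcript.

```python
import re

K_BOX = "TGTGAT"              # core K box motif, as given in the text
K_BOX_RESTRICTED = "CTGTGATA"  # restricted consensus (cTGTGATa)

def find_motif(utr, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # Normalize case and accept RNA-style input (U instead of T).
    seq = utr.upper().replace("U", "T")
    # A lookahead pattern lets overlapping occurrences be reported.
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() for m in re.finditer(pattern, seq)]

def scan_utrs(utrs):
    """Map each gene name to the positions of core and restricted K box matches."""
    return {
        name: {
            "core": find_motif(seq, K_BOX),
            "restricted": find_motif(seq, K_BOX_RESTRICTED),
        }
        for name, seq in utrs.items()
    }

# Made-up sequences for illustration only; these are not real 3' UTRs.
example_utrs = {
    "geneA": "aaacTGTGATattt",
    "geneB": "ggTGTGATcc",
    "geneC": "acgtacgt",
}
hits = scan_utrs(example_utrs)
```

A six-nucleotide motif is expected to occur by chance in long sequences, so hits from a scan of this kind are only candidates; conservation between species, as examined for en below, is one way to prioritize them.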
The en homeobox gene controls posterior compartment identity in Drosophila and has a fundamental role in metazoan patterning. We have found that the en 3′ UTR contains a K box located within a large block of sequence conserved between the distantly related species D. melanogaster and D. virilis (Fig. 9B) (Kassis et al., 1986; Poole et al., 1985). Furthermore, the en K box-containing region has been conserved between Drosophila and the flour beetle Tribolium castaneum, lineages that diverged at least three hundred million years ago (Fig. 9B) (Brown et al., 1994). Notably, the 3′ UTR of Tribolium en contains a pair of K boxes similar to the K box pairs found in Drosophila E(spl)m8 and E(spl)mδ, suggesting that this paired arrangement is of particular significance. The sequence of the novel second K box in Tribolium en is related to the corresponding sequence from Drosophila, suggesting that either Tribolium has acquired a second K box or the dipteran lineage has lost a K box. Overall, the evolutionary conservation of the en K box region among these insects strongly suggests that it is subject to functional constraint.
Gain-of-function alleles of extramacrochaetae and Serrate are associated with loss of K box motifs
We have shown here that loss of K box-mediated regulation from the E(spl)m8 3′ UTR strongly elevates reporter gene transcript and protein levels, and confers readily detectable gain-of-function activities on an E(spl)m8 genomic DNA transgene, including the capacity to substantially phenocopy the spl-enhancing effects of E(spl)D. These findings prompted us to consider whether known gain-of-function alleles of other genes may likewise be associated with loss of K box elements from their 3′ UTRs. A survey of extant hypermorphic mutants identified two such alleles, Achaetous (Ach) and SerrateD (SerD).
extramacrochaetae (emc) encodes an HLH protein that antagonizes the activity of bHLH proneural activators by forming ‘poisoned’ heterodimers unable to bind DNA (Ellis et al., 1990; Garrell and Modolell, 1990; Van Doren et al., 1991). Curiously, the emc 3′ UTR resembles many E(spl)-C 3′ UTRs, as it contains a Brd box, four copies of a GY box-like sequence (GTTTTCC) and a K box. The only known gain-of-function allele of emc, Ach, is associated with a transposon insertion, which deletes the codons for the C-terminal 42 amino acids as well as the entire normal 3′ UTR (Garrell and Modolell, 1990). Ser encodes one of two known ligands for the Notch receptor and functions in pattern formation and cell proliferation in the developing wing (Fleming et al., 1990; Rebay et al., 1991). The hypermorphic allele SerD is associated with a transposon insertion into the 3′ UTR of the gene; the mutant transcript is truncated but encodes a wild-type protein (Thomas et al., 1995). A single K box motif is lost from the 3′ UTR in SerD. Both Ach and SerD are associated with insertion of the Tirant transposon (Garrell and Modolell, 1990; Thomas et al., 1995).
However, the gain-of-function nature of these mutants is unlikely to be due to a strong transcriptional activation effect of this transposon, as transcription of Ser is not increased in SerD (Thomas et al., 1995). Instead, the transposon insertion is thought to result in increased stability of mutant transcripts. We suggest that loss of K box-mediated regulation in Ach and SerD may contribute to their hypermorphic character.
Is there a link between gene duplication and negative regulation mediated by Brd boxes and K boxes?
An unknown but probably very significant fraction of the genes in a typical metazoan genome can be mutated to an inactive state without conferring a mutant phenotype that is readily detectable under laboratory conditions (Miklos and Rubin, 1996). Members of paralogous multigene families are perhaps especially likely to fall into this category. Such genes are frequently referred to as ‘redundant’ because of their apparent functional overlap with other genes, but we may be confident that their evolutionary retention in an active state indicates that they are not truly redundant or otherwise dispensable. Nevertheless, in the absence of a recognizable loss-of-function phenotype, the existence of a particular gene of this type is frequently first revealed by the dominant phenotypic effects of a gain-of-function allele. Such is the case for the genes of the E(spl)-C and for Brd (Leviten and Posakony, 1996; Lindsley and Zimm, 1992).
Negative regulatory mechanisms that limit gene activity are obvious targets for mutations that will yield gain-of-function alleles. We have previously presented evidence that loss of Brd box sequences from the 3′ UTR, and the consequent loss of the negative regulation mediated by this motif, accounts at least in part for the gain-of-function properties of the original Brd1 allele (Lai and Posakony, 1997; Leviten et al., 1997; Leviten and Posakony, 1996). In this paper, we have similarly shown that mutation of K box elements in the E(spl)m8 3′ UTR confers a partial phenocopy of E(spl)D.
We are struck by the apparent prevalence of Brd box and K box motifs among genes that are members of paralogous gene families. The 3′ UTRs of five of the seven bHLH repressor-encoding genes of the E(spl)-C (m3, m5, m7, mγ and mδ) contain at least one copy of both of these classes of sequence element. We have suggested (Leviten et al., 1997) that Brd and E(spl)m4 constitute a small gene family. m4 also contains both Brd boxes and K boxes in its 3′ UTR, while Brd contains Brd boxes and possibly a K box (see legend to Fig. 1). Three other examples of duplicated gene sets in Drosophila appear to reinforce this association. First, en and invected, homeobox genes with overlapping functions in controlling posterior compartment identity, contain a K box and a Brd box, respectively, in their 3′ UTRs (Coleman et al., 1987; Poole et al., 1985). Moreover, the phylogenetic analysis presented above (Fig. 9) strongly suggests that the K box is important for en regulation. Second, all three genes in the Iro-C contain K boxes in their 3′ UTRs, are co-expressed in the wing imaginal disc and function to activate expression of ac and sc (Gomez-Skarmeta et al., 1996; Kehl et al., 1998). Finally, PKC53E and inaC, two protein kinase C genes in the 53E cytological region, each contain single Brd boxes in their 3′ UTRs (Rosenthal et al., 1987; Schaeffer et al., 1989). Although the normal role of PKC53E is unknown, its expression pattern overlaps that of inaC in the eye and thus it may have a related function. Like the bHLH genes of the E(spl)-C and the Brd/E(spl)m4 family, all of these are examples of protein sets related by primary sequence, expression pattern and putative or known function.
With these considerations in mind, it is intriguing to consider the possible evolutionary significance of negative regulatory motifs and mechanisms such as those exemplified by the Brd box and K box. Whether these elements were present in the ancestral single-copy genes prior to duplication and were then retained by the duplicated gene copies, or whether they were independently acquired by the members of the gene set after duplication, is unclear. In either case, it may be that the presence of these negative regulatory motifs made the gene duplication events more tolerable, by limiting the possibly deleterious gain-of-function effects that could result from the corresponding increase in gene activity. Thus, it is possible that negative post-transcriptional regulation may have not only specific cis-acting effects on gene expression, but also more general consequences for genome evolution.
ACKNOWLEDGEMENTS
The authors wish to thank Spyros Artavanis-Tsakonas for the gift of monoclonal anti-Groucho antibody, Charles Graham of the Scripps Institution of Oceanography for assistance with scanning electron micrography, members of James Kadonaga’s laboratory for assistance with 125I-Protein A western blotting, G. Hertz for discussions about the use of the CONSENSUS tools, and Adina Bailey, Scott Barolo, Karen Berger, Rick Firtel, and Dave Nellesen for critical comments on the manuscript and enlightening discussions. E. C. L. was supported by a fellowship from the Lucille P. Markey Charitable Trust and by a predoctoral fellowship from the NIH. C. B. was supported as a Visiting Scholar under the auspices of DOE and through the UCDRD program of the University of California and Los Alamos National Laboratory. This work was supported by NIH grant GM46993 to J. W. P.