ABSTRACT
The transcription factor SOX2 is a vital regulator of stem cell activity in various developing and adult tissues. Mounting evidence has demonstrated the importance of SOX2 in regulating the induction and maintenance of stemness as well as in controlling cell proliferation, lineage decisions and differentiation. Recent studies have revealed that the ability of SOX2 to regulate these stem cell features involves its function as a pioneer factor, with the capacity to target nucleosomal DNA, modulate chromatin accessibility and prepare silent genes for subsequent activation. Moreover, although SOX2 binds to similar DNA motifs in different stem cells, its multifaceted and cell type-specific functions are reliant on context-dependent features. These cell type-specific properties include variations in partner factor availability and SOX2 protein expression levels. In this Primer, we discuss recent findings that have increased our understanding of how SOX2 executes its versatile functions as a master regulator of stem cell activities.
Introduction
The sex-determining region Y (SRY)-related high mobility group (HMG)-box (SOX) family of transcription factors (TFs) are multifaceted regulators of development and tissue homeostasis. Experimental efforts have demonstrated that, as well as diverse functions in regulating the induction and maintenance of stemness, SOX TFs also have widespread influences on cell proliferation, differentiation and lineage specification (reviewed by Sarkar and Hochedlinger, 2013).
Consistent with their regulatory roles in these fundamental processes, SOX proteins have an ancient evolutionary origin. SOX-like proteins are present in Choanoflagellates, the unicellular relatives of animals, and SOX gene duplications are observed in multicellular organisms, including those distantly related to vertebrates, such as Poriferans and Placozoans, indicating rapid diversification of the SOX family during animal evolution (Guth and Wegner, 2008) (Fig. 1). In mammals, the SOX family consists of 20 different members that share approximately 50%, or more, homology between the amino acid sequence of their DNA-binding HMG domains (Bowles et al., 2000; Schepers et al., 2002) (Fig. 1). The different SOX family members are divided into distinct groups (A-H) based on the homology of their HMG-domain sequences. As a result of sequential genome duplications, most groups consist of several homologous SOX proteins, which typically share more than 80% amino acid identity to each other within their HMG domains, as well as homologies in motifs and functional domains outside their HMG domains (Wright et al., 1993; Bowles et al., 2000).
SOX1, SOX2, SOX3 and the fish-specific Sox19a/b make up the SOX-B1 group. SOX-B1 proteins all have well-described roles in neural development; however, SOX2 is initially expressed in the preimplantation embryo and subsequently in specific cell populations in the embryonic ectoderm, mesoderm and endoderm and in several organs of the fetus and adult (reviewed by Sarkar and Hochedlinger, 2013). Gain- and loss-of-function studies have demonstrated that SOX2 has key roles in the maintenance of stem cell populations in the embryo (Avilion et al., 2003; Bylund et al., 2003; Graham et al., 2003) and the adult, including stem cells in the brain (Favaro et al., 2009), stomach (Arnold et al., 2011), corneal epithelium (Bhattacharya et al., 2019), salivary gland (Arnold et al., 2011; Emmerson et al., 2017) and pituitary gland (Rizzoti et al., 2013; Andoniadou et al., 2013). Consistent with its important role in stem cells, SOX2 is one of the TFs that, together with OCT4 (also known as POU5F1), KLF4 and MYC, were originally shown to convert somatic cells into induced pluripotent stem cells (iPSCs) (Takahashi and Yamanaka, 2006). In addition, SOX2 is involved in regulating cell cycle progression, lineage specification and differentiation (Hagey and Muhr, 2014; Oosterveen et al., 2013; Peterson et al., 2012; Bergsland et al., 2011; Bylund et al., 2003; Graham et al., 2003).
Given these important regulatory capabilities and recent technical advancements that have increased our understanding of SOX2 function, in this Primer we take a mechanistic point of view to discuss how SOX2 achieves its distinct activities, and how these are regulated in a context-specific manner. Although we focus on the function of SOX2, several other SOX proteins have redundant activities within the same or alternative lineages (Box 1). We begin by reviewing the roles of SOX2 in regulating chromatin dynamics and priming genes for future activation during cell fate specification. We then discuss how SOX2 function is influenced by partner factor availability, as well as the level of SOX2 expression. Finally, we conclude by discussing the context-specific functions of SOX2 in tumour cells.
In mouse NPCs, the binding pattern of SOX3 overlaps substantially with that of SOX2 (Bergsland et al., 2011), and both SOX1 and SOX3 have a similar capacity as SOX2 to maintain chick NPCs in an undifferentiated state (Bylund et al., 2003). Furthermore, loss-of-function studies indicate that SOX3 can compensate for short hairpin RNA-based depletion of SOX2 in hESCs to maintain these cells in an undifferentiated and proliferative state (Wang et al., 2012). Moreover, SOX1 and SOX3 have the capacity to replace SOX2 functionally in the formation of iPSCs from human fibroblasts (Nakagawa et al., 2008). However, functional redundancy is not only shared between SOX proteins within the same subgroup. For example, similar to SOX2, SOX9 maintains stem cell features in the mouse brain (Scott et al., 2010), and both SOX9 and SOX17 can reduce cell cycle progression by antagonising WNT/β-catenin signalling (Mukherjee et al., 2020; Topol et al., 2009). Furthermore, several SOX proteins have the ability to bind nucleosomal DNA (Zhu et al., 2018), which indicates a shared function as pioneer factors.
There are possible explanations for the redundant functions of different SOX proteins. For instance, SOX proteins share biochemical properties within the HMG domain and can recognise the same core DNA-binding motif (reviewed by Kondoh and Kamachi, 2010), and members of the same SOX group interact with similar partner factors (reviewed by Kamachi and Kondoh, 2013).
SOX2-mediated chromatin regulation
SOX2 is a pioneer transcription factor
TFs control gene transcription by binding specific DNA sequences (known as motifs) and recruiting transcriptional regulators. However, one important characteristic that affects TF activity is the cell type-specific status of chromatin compaction and thus DNA accessibility (Allis and Jenuwein, 2016; Zaret, 2020). A unique feature of stem cells is their relatively relaxed, open chromatin state, which gives TFs the ability to access – and thus activate – several gene expression programmes (Atlasi and Stunnenberg, 2017). As a stem cell differentiates, chromatin accessibility generally decreases, which helps to consolidate lineage-specific gene expression patterns. However, genes that are activated during stem cell differentiation are often dependent on the de novo establishment of open chromatin domains (Boller et al., 2018). Thus, during stem cell differentiation, the induction of cell type-specific gene expression relies on modulation of the chromatin profile to make specific loci either more or less accessible to TF binding.
The primary mechanism of chromatin condensation is the organisation of DNA around histone octamers, which form nucleosomes. The nucleosome consists of a 147 base pair (bp) stretch of DNA wound nearly twice around an octamer of the four core histones H2A, H2B, H3 and H4 (reviewed by Kornberg and Lorch, 1999) (Fig. 2A). Although this configuration of the DNA provides a steric hindrance to the binding of most TFs, so-called ‘pioneer’ TFs possess the unique capacity to bind nucleosome-enriched chromatin (Box 2), which ‘loosens’ the chromatin, helping to expose silenced genes and permit their transcriptional activation (Iwafuchi-Doi and Zaret, 2014).
Besides its ability to bind nucleosomal DNA, SOX2 can synergise with additional factors that can alter histone modifications, nucleosome positioning and chromatin structure (Ho and Crabtree, 2010; Zaret and Mango, 2016). For example, in mouse NPCs, SOX2 interacts with components of the SWI/SNF complex (Engelen et al., 2011), a key regulator of chromatin accessibility that shifts the position of histone octamers along DNA in an ATP-dependent manner (Fig. 2D) (Whitehouse et al., 1999). Although it is not fully understood how the interaction between SOX2 and the SWI/SNF complex leads to the establishment of open chromatin, SOX10 utilises a similar interaction to regulate Schwann cell differentiation (Stolt and Wegner, 2016). A crucial step in this process is the induction of the TF Krox20 (also known as Egr2) (Ghislain and Charnay, 2006), which SOX10 activates by recruiting the SWI/SNF subunit BAF60a (SMARCD1) (Weider et al., 2012). Subsequently, SOX10 and KROX20 synergise with the catalytic component of the SWI/SNF complex, BRG1 (SMARCA4) (Marathe et al., 2013), as well as components of the NuRD chromatin remodelling complex, to activate the expression of myelin genes (Stolt and Wegner, 2016). Hence, the ability to synergise with partner factors allows SOX TFs to induce localised alterations in chromatin structure.
SOX2 has several features typical of a pioneer TF. For example, the enrichment of SOX2-binding motifs in accessible chromatin regions is robustly associated with SOX2 expression in several different human cell lines, leading to the proposition that SOX2 is involved in the establishment and maintenance of open chromatin (Lamparter et al., 2017) (Fig. 2B). In addition, in the early zebrafish embryo, maternally deposited SOX19b induces local changes in chromatin accessibility that are necessary for activation of zygotic gene expression (Lee et al., 2013; Pálfy et al., 2017; Gao et al., 2022). Finally, the function of SOX2 in the generation of iPSCs is associated with the defining ability of pioneering factors: to target DNase-I-insensitive (i.e. closed chromatin) DNA regions (Soufi et al., 2012).
SOX2 targets
The binding preferences of SOX2 for human nucleosome-enriched chromatin have been tested to examine the molecular underpinnings of the capacity of SOX2 to act as a pioneer factor (Soufi et al., 2015). Whereas SOX2 primarily targets a canonical SOX motif [(A/T)TTGT] at nucleosome-depleted DNA regions, it targets a degenerate core motif [(A/T)TTNT] at nucleosome-enriched DNA regions. Binding of SOX2 to its canonical motif normally induces a sharp bend in naked DNA (Reményi et al., 2003), but the absence of guanine in the degenerate motif greatly reduces SOX2-dependent DNA bending (Scaffidi and Bianchi, 2001). One hypothesis is that reduced DNA bending is better accommodated on the nucleosome surface (Zaret and Mango, 2016). At the same time, nucleosome-bound DNA is naturally arranged in a smooth ‘pre-bent’ conformation, which widens the minor groove and thus favours SOX2 binding (Soufi et al., 2015).
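The two motif classes described above can be written as simple sequence patterns. The following Python sketch scans both strands of a DNA sequence for the canonical [(A/T)TTGT] and degenerate [(A/T)TTNT] motifs. The function names and the example sequence are hypothetical and purely illustrative; genome-wide motif analyses would typically rely on position weight matrices rather than exact pattern matching.

```python
import re

# Motifs as given in the text: canonical (A/T)TTGT, degenerate (A/T)TTNT.
# Lookahead groups allow overlapping matches to be reported.
CANONICAL = re.compile(r"(?=([AT]TTGT))")
DEGENERATE = re.compile(r"(?=([AT]TT[ACGT]T))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, pattern: re.Pattern) -> list[tuple[int, str, str]]:
    """Return (start, strand, matched sequence) for every match on either strand.

    Start positions are 0-based and always reported on the forward strand.
    """
    seq = seq.upper()
    sites = []
    for m in pattern.finditer(seq):
        sites.append((m.start(), "+", m.group(1)))
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(seq) - m.start() - len(m.group(1))
        sites.append((start, "-", m.group(1)))
    return sorted(sites)


# Hypothetical example sequence (not taken from any real locus).
example = "GGCATTGTCCGGATTCTGGACAATGG"
canonical_sites = find_sites(example, CANONICAL)
degenerate_sites = find_sites(example, DEGENERATE)
```

Because every canonical match also satisfies the degenerate pattern, the degenerate scan returns a superset of the canonical sites; the additional hits are the candidates that lack the guanine.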
By developing ‘nucleosome consecutive affinity purification with systematic evolution of ligands by exponential enrichment’ (referred to as NCAP-SELEX), Zhu and colleagues have explored the binding specificity of 220 different TFs to a library of randomised DNA ligands, reconstituted either with nucleosomes or in a nucleosome-depleted state (Zhu et al., 2018). Although most of the examined TFs only bind to free DNA, other TFs, including SOX2, are capable of binding to nucleosomal DNA (Zhu et al., 2018). Single-molecule fluorescence microscopy experiments have also demonstrated the ability of full-length human SOX2 to target nucleosomes in vitro, with a binding preference for sites located at the dyad region of the nucleosome (Li et al., 2019). The dyad region of the nucleosome contains a stretch of a single DNA helix, in which the central region is demarcated by the dyad axis (position 0) and a minor groove that is oriented away from the histone surface (Fig. 2A) (Cutter and Hayes, 2015). SOX2 binds to the dyad region more stably than to naked DNA, which could suggest that SOX2 also establishes contacts with histones (Li et al., 2019). However, cryo-electron microscopy and SOX-DNA crystal structure analyses have shown that the HMG domain of human SOX2 binds to the minor groove, two DNA-helical turns away from the dyad axis (super-helix location +2) (Fig. 2A) (Dodonova et al., 2020). Although these findings are based on different experimental methods, it is also possible that the binding preference for nucleosomal DNA differs between the SOX2 HMG domain and full-length SOX2.
SOX2 primes genes for activation
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis has been instrumental for our understanding of SOX2 function by allowing genome-wide identification of DNA regions bound by SOX2 (see Table 1 for references). These regions include distal enhancer regions, referred to as cis-regulatory modules (CRMs), and proximal promoter regions. ChIP-seq analyses in several different cell types have shown that SOX2, similar to other pioneer factors and regulators of cell type-specific gene expression patterns, preferentially binds to CRMs as opposed to promoter regions (Bergsland et al., 2011; Chen et al., 2008; Hagey et al., 2016). Furthermore, chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) in wild-type and SOX2 mutant mouse brains has demonstrated that RNA polymerase II (POL II)-bound promoters are enriched for CRMs targeted by SOX2. Consistent with this, loss-of-function experiments have shown a decrease in POL II-mediated long-range interactions and reduced gene expression (Bertolini et al., 2019). Together, these data indicate that SOX2 is involved in establishing CRM-promoter interactions and thereby regulates global transcriptional profiles. Indeed, it has been well documented in mouse (m) and human (h) embryonic stem cells (ESCs) that SOX2 targets actively transcribed genes involved in regulating stem cell maintenance and proliferation (Boyer et al., 2005; Chen et al., 2008). However, SOX2 also targets a significant number of silent genes that are activated following ESC specification and differentiation (Bergsland et al., 2011; Boyer et al., 2005; Lee et al., 2006) (Fig. 3).
Many of these so-called ‘pre-bound genes’ are marked by bivalent histone modifications, representing both transcriptionally active (e.g. H3K4me3) (Barski et al., 2007; Bernstein et al., 2005) and transcriptionally silent (e.g. H3K27me3) (Cao et al., 2002; Czermin et al., 2002; Müller et al., 2002) methylation states, which resolve into active or silent histone domains as the genes become expressed or repressed, respectively, during lineage specification (Bergsland et al., 2011; Bernstein et al., 2006; Mikkelsen et al., 2007) (Fig. 3).
It is possible that the pre-binding of silent, lineage-specific genes helps to maintain their CRMs in an open conformation, because the chromatin generally becomes less accessible during ESC differentiation. Consistent with this idea, hESCs differentiating as a result of small interfering (si)RNA-mediated knockdown of SOX2 fail to upregulate germ layer-specific gene expression and instead form trophectoderm (Fong et al., 2008). Notably, genes that are pre-bound by SOX2 in ESCs may also rely on the subsequent expression of SOX TFs for their activation in specific lineages. For example, during the differentiation of mESCs into B cells, the SOX2 pre-bound λ5-VpreB1 locus is first activated after SOX2 expression is downregulated and SOX2 binding has been replaced by SOX4 (Liber et al., 2010). Moreover, a comparison of genome-wide binding data of SOX2 in mESCs and SOX3 in mESC-derived neural progenitor cells (NPCs) has revealed that SOX2 pre-binds many neural genes in ESCs that are subsequently bound and activated by SOX3, and possibly by SOX2 and SOX1, in NPCs (Bergsland et al., 2011). However, SOX-B1 TFs also pre-bind an additional set of silent neuronal and glial genes in NPCs. These genes are later activated as SOX-B1 TFs are downregulated and their binding has been replaced by SOX11 in differentiating neurons (Bergsland et al., 2011) (Fig. 3) and by the SOX-E proteins SOX9 and SOX10 in differentiating glia (Klum et al., 2018).
Consistent with the finding that SOX2 pre-binding is associated with bivalent histone modifications in ESCs, siRNA-mediated knockdown of SOX2 in mESCs strongly reduces H3K4 methylation around SOX2 pre-bound loci (Liber et al., 2010), indicating that SOX2 is involved in the establishment of histone modifications associated with active chromatin. Indeed, SOX2 directly interacts with the ASH2L subunit of the methyltransferase SET/MLL complex, which mediates H3K4 trimethylation (Yang et al., 2015). Apart from these findings, the specific role of SOX2 in pre-binding for later gene activation is not fully clear.
During cell division, chromatin undergoes significant changes, such as increased compaction, as well as decreased association with transcriptional regulators and decreased levels of RNA synthesis. To maintain lineage decisions and facilitate rapid re-establishment of gene expression in daughter cells, dividing cells utilise a mechanism termed mitotic bookmarking (reviewed by Palozola et al., 2019). The concept of mitotic bookmarking by TFs emerged when it was demonstrated that the promoter regions of some genes were targeted by general TFs in human cell lines to prevent their compaction during mitosis and to efficiently allow their reactivation following mitosis (Martínez-Balbás et al., 1995; Xing et al., 2005). More recent studies in human cell lines have found that many genes are expressed at low levels during mitosis, despite chromatin condensation (Palozola et al., 2017), and that several TFs that primarily regulate gene activity through distal CRMs are also associated with chromatin in mitotic cells. For example, western blot analysis, immunofluorescence experiments and live-cell imaging have demonstrated that SOX2 is associated with mitotic chromatin in mESCs (Teves et al., 2016; Liu et al., 2017; Deluz et al., 2016; Festuccia et al., 2019). Furthermore, cell-cycle-phase-specific degradation of SOX2 protein in mESCs has revealed that SOX2 is necessary at the M-G1 phase transition to maintain cells in a pluripotent state or to promote their commitment to the neuroectodermal lineage upon differentiation (Deluz et al., 2016).
However, although ChIP-seq experiments have found SOX2 at specific binding sites in mitotic chromatin (of which a subset are also bound by SOX2 at non-mitotic cell cycle stages; Deluz et al., 2016; Liu et al., 2017), another study reports little or no support for mitotic bookmarking activities by SOX2 (Festuccia et al., 2019). Moreover, forced degradation of SOX2 at the M-G1 phase transition in mESCs does not alter the onset of expression of genes reported to be bookmarked by SOX2 (Deluz et al., 2017). Thus, although the function of SOX2 protein associated with mitotic chromatin is not fully apparent, it is possible that it increases the local concentration of SOX2 close to its genetic targets, and thereby facilitates its binding to appropriate targets in early G1 phase (Festuccia et al., 2019).
SOX2 partner factors
ChIP-seq experiments have revealed a striking divergence in the binding pattern of SOX2 in different cell populations (Hagey et al., 2016, 2018; Klum et al., 2018; Lodato et al., 2013; Sarkar et al., 2016). For example, only a minority of the regions targeted by SOX2 in mESCs are also robustly bound by SOX2 in primed mouse epiblast stem cells (∼10% overlap) (Matsuda et al., 2017) or in mouse NPCs (∼9% overlap) (Hagey et al., 2016; Lodato et al., 2013). Similarly, a very low overlap in SOX2 binding sites has been found between mESCs and mouse gastric progenitors (Sarkar et al., 2016). Furthermore, as demonstrated in spatially separated cells within the same germ layer, such as the embryonic spinal cord and cortex or stomach and lung, the overlap in SOX2-binding regions can be quite low (e.g. <50%), even in highly related cell types (Hagey et al., 2016, 2018) (Table 1). Although constraints in chromatin accessibility are likely to impact the binding pattern of SOX2, DNase-I-hypersensitivity experiments in NPCs from the embryonic cortex and spinal cord have shown that most of the DNA regions specifically targeted by SOX2 in the cortex are also accessible in the spinal cord (Hagey et al., 2016). Instead, the feature that best explains the target specificity of SOX2 binding in different cell types is the availability of cooperating partner factors (reviewed by Kondoh and Kamachi, 2010) (Fig. 2C).
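Overlap figures such as those quoted above are generally obtained by intersecting peak intervals from two ChIP-seq experiments. The following is a minimal Python sketch of that calculation, assuming that peaks are given as (chromosome, start, end) half-open intervals and that any shared base pair counts as overlap. The function name and the peak lists are hypothetical toy values, and the cited studies may have applied different criteria (for example, minimum overlap or peak-strength thresholds).

```python
from collections import defaultdict

Peak = tuple[str, int, int]  # (chromosome, start, end), half-open interval


def fraction_overlapping(query: list[Peak], reference: list[Peak]) -> float:
    """Fraction of query peaks sharing at least one base pair with a reference peak."""
    if not query:
        return 0.0
    # Group reference intervals by chromosome for per-chromosome comparison.
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap if each starts before the other ends.
        if any(start < r_end and r_start < end for r_start, r_end in by_chrom[chrom]):
            hits += 1
    return hits / len(query)


# Toy peak sets (illustrative values only, not real data).
cell_type_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr3", 10, 90)]
cell_type_b = [("chr1", 150, 250), ("chr2", 400, 480)]

shared = fraction_overlapping(cell_type_a, cell_type_b)
```

Note that the measure is asymmetric: the fraction of peaks in one cell type that are shared depends on which data set is used as the query, so reported overlaps should state the direction of the comparison.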
SOX2 complexes in pluripotent cells
Fundamental insights into the cooperation between SOX2 and partner factors have been made by examining identified CRMs in reporter gene transactivation assays and through protein-binding studies in vitro. Over 25 years ago, SOX2 was shown to form a ternary complex with OCT4 on a CRM of the Fgf4 gene (Yuan et al., 1995), which is required for the specification of primitive endoderm in pluripotent epiblast cells within the inner cell mass of the pre-implantation stage embryo (Niswander and Martin, 1992; Kang et al., 2013; Krawchuk et al., 2013).
To examine how the reprogramming TFs target nucleosomal DNA, the binding of ectopic SOX2, OCT4, KLF4 and MYC has been measured at early stages of iPSC formation in human fibroblasts in vitro. Collectively, these studies demonstrate that SOX2 can independently target heterochromatin and allow MYC to bind nucleosomal DNA both in vitro and in vivo, which in turn enhances the binding of SOX2 to chromatin (Soufi et al., 2012, 2015). Thus, the ability of SOX2 to bind silent, nucleosome-enriched DNA and function as a lead factor that assists the binding of additional factors is consistent with its pioneer function (Iwafuchi-Doi and Zaret, 2014) (Fig. 2C). The cooperative binding of SOX2 and partner factors to DNA has also been examined in live cells using single-cell and single-molecule imaging strategies. Through this experimental approach, it has been demonstrated that SOX2 and OCT4 target their DNA-binding sites in a hierarchically ordered fashion, although which TF acts first is disputed (Chen et al., 2014; Li et al., 2019). Based on the finding that SOX2 makes fewer non-specific binding attempts to DNA in the absence of its transactivation domains, it has been suggested that TFs with less complex transactivation domains have an advantage as lead factors over TFs whose transactivation domains have several high-affinity surfaces. According to this model, a partner factor, such as OCT4, with a more complex transactivation domain is better suited to execute multiple interactions with co-activators (Chen et al., 2014).
In addition to SOX2 and OCT4, pluripotent stem cells depend on the activity of the TF NANOG (Chambers et al., 2003; Mitsui et al., 2003). Genome-wide binding analyses in ESCs have demonstrated that SOX2 targets most of its binding sites together with both OCT4 and NANOG (Boyer et al., 2005; Chen et al., 2008). These TFs can physically interact, albeit with different affinities, which appears to be necessary for their cooperative binding to DNA and for their capacity to induce and maintain pluripotent stem cell features (Ambrosetti et al., 1997; van den Berg et al., 2010; Gagliardi et al., 2013; Ivanova et al., 2006; Masui et al., 2007; Takahashi and Yamanaka, 2006).
Cell type-specific SOX2 complexes
Apart from pluripotent stem cells, SOX2 is also expressed in a range of embryonic and adult stem and progenitor cell populations, in which it has been ascribed several functions. Consistent with the diversity of these processes, immunoprecipitation experiments, ChIP-seq data sets and functional studies have identified an array of SOX2 partner factors (Engelen et al., 2011; Hagey et al., 2018). In addition to chromatin remodelling and histone-modifying proteins (Box 2; Fig. 2D) (Engelen et al., 2011), SOX2 interacts with members of multiple TF families, including the FOX, TCF, homeobox, PAX, POU and RFX families (Bergsland et al., 2011; Hagey and Muhr, 2014; Hagey et al., 2018; Kondoh and Kamachi, 2010; Lodato et al., 2013; Oosterveen et al., 2012; Peterson et al., 2012).
To address the molecular mechanisms underlying these functional associations between SOX2 and its partner factors, structural models of ternary complexes between CRMs and the DNA-binding domains of SOX2, OCT4 or PAX6 have been resolved (Reményi et al., 2003). Importantly, the distance between the binding sites of SOX2 and those of OCT4 or PAX6 within CRMs results in distinct conformational arrangements, which are necessary for the function of the resulting ternary complexes. For example, SOX2 and OCT4 assemble into distinct complexes on CRMs of Fgf4 and Utf1, in which the spacing between the SOX2 and OCT4 binding sites differs; this difference is important for their ability to attract transcriptional co-regulators (Reményi et al., 2003).
Furthermore, SOX2 and its SOX-B1 homologues have been implicated in translating several signalling pathways, such as those of sonic hedgehog (SHH), bone morphogenetic protein (BMP), WNT and retinoic acid (RA), into tissue-specific transcriptional outputs (discussed below). For example, cellular patterning in the developing spinal cord crucially depends on GLI TFs acting downstream of graded SHH signalling. CRMs of neural-patterning genes downstream of SHH signalling are synergistically induced by GLI and SOX-B1 proteins, which act via juxtaposed binding sites (Oosterveen et al., 2012; Peterson et al., 2012). Moreover, when SOX-B1 TFs are misexpressed in the developing chick limb bud, activators of the SHH pathway (i.e. SmoM2, a constitutively active form of smoothened) can induce ectopic expression of neural-specific target genes in mesodermal cells (Oosterveen et al., 2013). Thus, despite the ubiquitous expression of SOX-B1 TFs in NPCs, interactions with spatially restricted partner factors enable them to induce neural gene expression in a cell type-specific fashion.
Another important SOX2-interacting pathway is canonical WNT signalling. WNT ligands bind to their cognate cell surface receptors (such as FZD and LRP5/6), thereby stabilising non-cytoskeletal β-catenin, which accumulates in the nucleus where it acts as a co-activator for TCF/LEF TFs. Interestingly, SOX2 can interact with TCF/LEF in co-immunoprecipitation experiments (Hagey and Muhr, 2014). Moreover, ChIP-seq analyses of mouse NPCs from embryonic cortices or mouse gastric progenitors have shown that TCF/LEF motifs are significantly enriched in SOX2-bound regions (Hagey and Muhr, 2014; Sarkar et al., 2016). A study in cortical mouse NPCs has found that SOX2 represses genes that are enriched for SOX and TCF/LEF binding motifs separated by 5-9 bp. Conversely, this configuration of SOX and TCF/LEF motifs is under-represented in SOX2-bound CRMs that are associated with genes activated by SOX2 (Fig. 4A) (Hagey and Muhr, 2014). The spacing of 5-9 bp between SOX and TCF/LEF binding motifs suggests that these TFs can bind on opposing sides of the DNA helix (around genes repressed by SOX2) and that this specific binding configuration results in a ternary complex that facilitates the recruitment of GRO/TLE co-repressors. Indeed, co-immunoprecipitation experiments in vitro, together with transactivation experiments in chick embryos, have revealed that the interaction between SOX2 and TCF/LEF TFs on the Ccnd1 promoter promotes the recruitment of GRO/TLE co-repressors and thereby suppresses Ccnd1 expression (Fig. 4B) (Hagey and Muhr, 2014). Thus, SOX2 can counteract WNT-mediated activation of cyclin D1 and thereby repress NPC proliferation (Fig. 4B). Importantly, SOX2 and SOX3 can also interact directly with β-catenin (Zorn et al., 1999; Hagey and Muhr, 2014; Mansukhani et al., 2005). Through this interaction, SOX2 has been proposed to attenuate WNT-mediated gene activation in osteoblasts (Ambrosetti et al., 2008; Seo et al., 2011). 
However, whether SOX2 can also contribute to the activation of WNT-regulated gene expression in certain contexts is currently unknown. Evidence from SOX17 in gastrulating Xenopus embryos may provide some mechanistic insight, because SOX17 can also repress TCF- and β-catenin-mediated gene activation to promote endodermal gene expression over a mesectodermal fate (Mukherjee et al., 2020; Zorn et al., 1999). Furthermore, SOX17 and β-catenin can co-occupy CRMs to synergistically activate gene transcription in a TCF-independent manner (Mukherjee et al., 2020).
Level-dependent function of SOX2
Examples from hypomorph mutations
Another means of regulating TF activity is by modulating protein levels. Abnormalities linked to decreased SOX2 levels have been reported in humans, in which hypomorphic mutations of the SOX2 gene can lead to ocular defects, such as anophthalmia and microphthalmia, which manifest as absent or abnormally small eyes, as well as oesophageal atresia, in which the oesophagus fails to reach the stomach, and tracheoesophageal fistula, in which the trachea and oesophagus fail to separate properly (Fantes et al., 2003; Hagstrom et al., 2005; Williamson et al., 2006). Similar phenotypes have been reported in mice in which the level of SOX2 expression has been systematically decreased through a series of SOX2 mutations, including null and hypomorphic alleles (Taranova et al., 2006). For example, reduction of SOX2 in retinal progenitors leads to defects in their capacity to self-renew and differentiate, with a reduction of SOX2 levels to less than 40% of wild type causing microphthalmia (Taranova et al., 2006). Moreover, a similar genetic strategy has revealed a dose-dependent role of SOX2 in the differentiation and morphogenesis of foregut-derived organs, whereby decreasing levels of SOX2 in the oesophagus correlate with the development of tracheoesophageal fistula (Que et al., 2007). The absence of SOX2 in the mouse foregut results in a complete transformation of the oesophagus into the trachea (Trisno et al., 2018; Teramoto et al., 2020). Thus, SOX2 dosage is vital for the proper specification and differentiation of retinal progenitors and anterior foregut endoderm.
Regulation of cell proliferation
SOX2 protein levels can also influence the rate of proliferation of stem and progenitor cells. For example, Müller glia cells constitute a quiescent stem cell population in the adult retina, and the removal of Sox2 in mice promotes cell cycle re-entry and mitotic activation (Surzenko et al., 2013). Moreover, in the ventricular zone of the developing mouse cortex, SOX2 is more highly expressed in the slowly dividing, multipotent, radial glia cells than in the rapidly dividing intermediate progenitor cells, which are committed to neurogenesis (Hagey and Muhr, 2014; Hutton and Pevny, 2011). Gain- and loss-of-function experiments in mouse embryonic cortices, spinal cord and stomach have demonstrated that increasing levels of SOX2 slows cell cycle progression whereas reducing SOX2 levels increases the proliferation rate of progenitors (Hagey and Muhr, 2014).
These findings raise the question of how differences in SOX2 expression levels are interpreted and result in distinct cellular responses. As we have discussed, SOX2 represses the pro-proliferative gene Ccnd1 (Hagey and Muhr, 2014) in a dose-dependent manner. This activity is dependent on low-affinity binding sites that contain one or more base alterations from the consensus binding motif and are less efficiently bound by SOX2 in ChIP-seq and gel-shift assays (Hagey and Muhr, 2014). Hence, it is tempting to speculate that the dose-dependent function of SOX2 can be explained by a simple affinity model, whereby the occupancy of SOX motifs in CRMs is determined by their affinity for SOX2 and the amount of active SOX2 protein present (Fig. 4B).
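The simple affinity model can be illustrated with single-site equilibrium binding, in which the fractional occupancy of a site is [SOX2] / ([SOX2] + Kd). The Python sketch below is an illustration under stated assumptions rather than a model taken from the cited study: binding is treated as non-cooperative, and the dissociation constants and SOX2 levels are arbitrary values chosen only to contrast a high-affinity with a low-affinity site.

```python
def occupancy(concentration: float, kd: float) -> float:
    """Fractional occupancy of a single site at equilibrium (non-cooperative binding)."""
    if concentration < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and kd must be > 0")
    return concentration / (concentration + kd)


# Hypothetical dissociation constants in arbitrary units (assumed, not measured).
KD_HIGH_AFFINITY = 1.0   # consensus-like site
KD_LOW_AFFINITY = 20.0   # site with base alterations from the consensus

# Compare occupancy of both site classes across a range of SOX2 levels.
for sox2_level in (1.0, 5.0, 20.0, 80.0):
    high = occupancy(sox2_level, KD_HIGH_AFFINITY)
    low = occupancy(sox2_level, KD_LOW_AFFINITY)
    print(f"SOX2={sox2_level:5.1f}  high-affinity={high:.2f}  low-affinity={low:.2f}")
```

Under these assumptions, the high-affinity site is at least half occupied at every level tested, whereas occupancy of the low-affinity site rises from about 0.05 to 0.80 across the same range. Genes that depend on low-affinity sites would therefore respond in a graded manner to SOX2 dosage.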
Conclusion
Recent methodological advancements have made it possible to characterise the function of SOX2 at a genome-wide scale. The insights gained from these experiments have greatly improved our mechanistic understanding of how SOX2 regulates cell behaviour. Rather than focusing on specific downstream targets, these approaches have enabled a full appreciation of SOX2-regulated genes. In addition, these methods have allowed us to incorporate SOX2 into a global network of binding partners, epigenetic modifications and chromatin structures, which ultimately result in the regulation of gene expression.
As described in this Primer, SOX2 has key roles in the generation of iPSCs and in maintaining homeostasis of various organ-specific stem cell populations. The important role of SOX2 in regulating stemness and stem cell differentiation is consistent with its function as a pioneer factor, but also with its capacity to pre-bind silent genes and thereby maintain their associated CRMs in an open conformation. Although binding of SOX2 to nucleosomes has been shown to partly dissociate the nucleosomes, further studies are needed to understand the downstream molecular hierarchy that leads to the efficient establishment of open chromatin and, potentially, gene activation. Furthermore, SOX2 is necessary for the development and growth of various types of tumours but can also suppress tumour growth, which has been exemplified by a mouse gastric cancer model (Sarkar et al., 2016). Although it is tempting to speculate that these diverse functions in tumorigenesis reflect the capacity of SOX2 to regulate stem cell features and suppress WNT/β-catenin-driven cell proliferation, more experiments are needed to understand the context-dependent role of SOX2 as an oncogene and tumour suppressor (Box 3).
A general role of SOX2 is to regulate stem cell maintenance and self-renewal. ChIP-seq experiments in a variety of pluripotent, neural and endodermal cells have revealed that the common SOX2 binding sites in these cells are mostly enriched around genes involved in stem cell division (Hagey et al., 2018). As such, SOX2 is also necessary for the proliferation and maintenance of diverse types of tumour cells and is thus proposed to function as an oncogene (Bass et al., 2009). For example, SOX2 is frequently amplified in lung and oesophageal cancers (Bass et al., 2009; Rudin et al., 2012), and loss-of-function experiments have demonstrated that SOX2 is necessary for the development of glioblastoma, Rb deficiency-induced pituitary tumours and squamous cell carcinoma (Boumahdi et al., 2014; Favaro et al., 2014; Kareta et al., 2015). However, as in healthy stem cells, SOX2 expression levels can affect cancer cell proliferation and SOX2 can function as a context-dependent tumour suppressor. Although some SOX2 expression is necessary to endow mouse glioma and medulloblastoma precursor cells with the competence to self-renew (Ahlfeld et al., 2013; Favaro et al., 2014), high levels of SOX2 reduce their proliferation rates (Cox et al., 2012). Similarly, in the mouse lung, KRAS-induced adenocarcinoma formation is strongly enhanced in a SOX2 heterozygote background (Xu et al., 2014). Moreover, SOX2 overexpression can suppress the proliferation of human gastric cancer cell lines (Otsubo et al., 2008), and the loss of SOX2 enhances tumour growth in a mouse model of WNT-driven gastric cancer (Sarkar et al., 2016). Together, these findings suggest that SOX2 may function in a context-dependent manner to either repress or promote tumour growth.
In conclusion, SOX2 has diverse regulatory functions in both healthy and cancerous stem cell populations. Further work on how SOX2 achieves its different functions in a context-dependent fashion could have important implications for developmental biology, regenerative medicine and our understanding of stem cell-associated diseases.
Acknowledgements
We would like to thank the members of the Muhr lab for helpful comments and discussions. We apologise to colleagues whose work we could not cite owing to space limitations. Illustrations were made by Mattias Karlen.
Footnotes
Funding
This research was supported by grants from the Swedish Research Council (Vetenskapsrådet; 2021-03083 to J.M.), The Swedish Cancer Foundation (211741PJ01H to J.M.), The Swedish Child Cancer Foundation (Barncancerfonden PR2019-0148 to J.M.), The Swedish Brain Foundation (Hjärnfonden FO2021-0315 to J.M.) and Cancerfonden.
References
Competing interests
The authors declare no competing or financial interests.