Distributed social learning may occur at many temporal and spatial scales, but it rarely adds up to a stable culture. Cultures vary in stability and diversity (polymorphism), ranging from chaotic or drifting cultures, through cumulative polymorphic cultures, to stable monolithic cultures with high conformity levels. What features can sustain polymorphism, preventing cultures from collapsing into either chaotic or highly conforming states? We investigate this question by integrating studies across two quite separate disciplines: the emergence of song cultures in birds, and the spread of public opinion and social conventions in humans. In songbirds, the learning process has been studied in great detail, while in human studies the structure of social networks has been experimentally manipulated on large scales. In both cases, the manner in which communication signals are compressed and filtered – either during learning or while traveling through the social network – can affect culture polymorphism and stability. We suggest a simple mechanism of a shifting balance between converging and diverging social forces to explain these effects. Understanding social forces that shape cultural evolution might be useful for designing agile communication systems, which are stable and polymorphic enough to promote gradual changes in institutional behavior.

Social learning and diverse (polymorphic) cultures are two cornerstones of human civilization. However, culture is not unique to humans (Laland and Hoppitt, 2003): distributed social learning can also give rise to the accumulation of shared behaviors in groups of dolphins (Hassler and Hogarth, 1977; Reiss, 2011), monkeys (van de Waal et al., 2013) and songbirds (Fehér et al., 2009). Cultures evolve in three stages: innovation (often by a single animal), transmission through a social network, and modifications (at the population level) (Henrich, 2001). Cultures vary in their stability and richness: at one extreme, culture can quickly evolve into a stable monolithic state, with high conformity across individuals (Fig. 1A). This has been demonstrated experimentally in wild vervet monkeys: when a group was presented with two food sources, edible red corn and bitter-tasting pink corn, the group quickly developed a ‘culture’ of avoiding the pink corn and sustained it for several months after the bitter taste had been removed from the pink corn. Young individuals that migrated into that group, from a group of monkeys that had received the opposite treatment (and avoided red corn), quickly switched to consuming red corn and avoiding pink corn. Those young individuals never tasted bitter corn, and yet they adopted the group's norm with high levels of conformity (van de Waal et al., 2013).

At the other extreme, one may imagine unstable or chaotic cultures. Highly chaotic cultures are probably common, but difficult to study. A milder form of cultural instability is a drifting culture (Fig. 1C). Drifting cultures have been observed in whale songs (Garland et al., 2011): male humpback whales produce highly stereotyped, repetitive songs. All males within a population conform to a certain song type, but not for very long. Song types spread rapidly, like cultural ripples, over thousands of miles, resulting in a strong annual fluctuation in songs produced within a region.

Perhaps the most fascinating cultures are between those two extremes (Fig. 1B): cultures that exhibit many shared behavioral components (polymorphism), but also stability, even over hundreds of generations, despite the constant flow and spreading of multiple behavioral patterns. For example, in white-crowned sparrows (Zonotrichia leucophrys) (Marler and Tamura, 1962), juveniles acquire their songs by imitating songs from their neighbors, which collectively leads to the establishment of local song dialects. When listening to a white-crowned sparrow singing, it is often possible to recognize both the dialect and the individual bird. This is because each dialect has several distinct features, or a shared syllable vocabulary. For example, in a certain forest, songs may typically begin with a few down-sweeps and include a long ‘buzz’, whereas in a nearby forest, songs may start with a single down-sweep and include a prolonged pure tone. An individual bird might produce only a subset of the shared vocabulary and therefore have a unique song. However, collectively, the vocabulary remains stable at the population level over decades (Garcia et al., 2015). Local song dialects can have an important role in courtship and in territorial behaviors. Evidence from a few songbird species suggests that females prefer males whose song includes syllables from the shared vocabulary (Maney et al., 2003; O'Loghlen and Rothstein, 1995). Males can also signal different levels of aggression by matching the song types of their rivals to various extents during territorial disputes (Akçay et al., 2013; Stoddard et al., 1992). The song culture is therefore an accumulation of shared behavioral patterns, which are acquired through social learning.
Local dialects allow singing behavior to communicate both individual identity and group identity, so that birds can distinguish between local and foreign individuals (Mammen and Nowicki, 1981), and resolve territorial disputes without fighting (Akçay et al., 2013). Note that these functions require the retention of stability and polymorphism in the shared vocabulary: the stability facilitates retention of group identity, while polymorphism provides sufficient ‘bandwidth’ for signaling individual identity (Mundinger, 1970) and for communicating different levels of aggression by varying the degree of song matching.

What features can sustain stable polymorphic cultures and prevent them from collapsing into either chaotic or highly conforming states? One may start by asking, more generally, what sort of interactions between elements would accumulate to create a stable and complex structure? In the physical world, for example, macro structures are often formed by the balance between different forces: some promote cohesion by attracting particles over long distances, whereas others prevent implosion by repelling them over short distances (Badii and Politi, 1999). In a similar vein, we suggest that the combined influences of converging and diverging social forces may either promote or undermine stable polymorphism at the macro level of culture (Fig. 1D).

The accumulation of social interactions into culture can be studied at two levels: at the level of dyadic interactions between individuals who influence each other or learn from each other, and at the level of signal propagation through the social network. We will review these levels by integrating studies across two model systems: the emergence of song culture in birds, and the spread of public opinion and social conventions in humans. In songbirds, we will focus on how vocal learning (through dyadic interaction) shapes culture. Studying this process in songbirds has two important advantages: first, birds can be kept socially isolated and their developmental experience can be fully controlled (Tchernichovski et al., 2004). Second, the songbird brain is highly accessible for experimental manipulations, allowing mechanistic investigation of social and vocal coordination (Benichov et al., 2015). As discussed above, convergence through learning and interaction may result in stable polymorphic vocal cultures in songbirds. Similar social processes have been studied in humans: convergence has been shown in natural dialog (Levelt and Kelter, 1982), and experimentally demonstrated in both linguistic (Brennan and Clark, 1996; Pickering and Garrod, 2006) and non-linguistic (Galantucci, 2005; Garrod et al., 2007) communication. However, only recently have studies started to explore how mechanisms of convergence may scale up to the level of stable cultures (Centola and Baronchelli, 2015). In this Review, in order to complement the mechanistic strength of birdsong research, we focus on integrating song learning studies in birds with human studies at the social network level, where controlled experiments show how the structure (topology) of social networks can shape public opinion and social conventions.

Social influence and peer learning are widespread, but song culture is the rare example where acquired behaviors can accumulate over decades (Garcia et al., 2015; Marler and Tamura, 1962). Although song learning is ubiquitous in songbirds, local song dialects were detected in only a few species (Podos and Warren, 2007). Song imitation can be more or less accurate depending on genetic, ecological and social factors: song similarity between birds decreases with genetic distance and high genetic flow between populations may therefore promote polymorphism in song structure within a group (MacDougall-Shackleton and MacDougall-Shackleton, 2001). Ecological factors can further contribute to polymorphism. For example, juvenile birds that are subject to even a mild nutritional stress during song learning often fail to accurately imitate the song of their adult tutor (Nowicki et al., 2002). Finally, at the social level song learning may be shaped by interactions with peers, such as affiliative and aggressive interactions between siblings (Derégnaucourt and Gahr, 2013) and female guidance during song development (West and King, 1988).

We do not know how convergence (due to learning) and divergence (due to the accumulation of song-copying errors; Lachlan et al., 2016) add up to explain how local cultures are formed. Only a handful of multigenerational studies have documented the evolution of local song dialects over time. One such example is in the saddleback (Philesturnus carunculatus), a semi-flightless songbird. The males sing highly diverse songs, which are shared within a group of up to 20 individuals occupying contiguous territories. New song groups (local dialects) are thought to emerge from errors in song learning (Jenkins, 1978). But when the rate of errors in song learning is too high, song dialects may fail to emerge and stabilize. For example, the high frequency of song copying ‘errors’ in zebra finches can explain why different domesticated colonies show only weak local dialects (Lachlan et al., 2016). But interestingly, even in cases where diverging forces are strong enough to prevent the establishment of local song dialects, birds still maintain their species-specific song features. What sort of ‘long distance’ converging forces can account for this?

Songbirds are capable of imitating a broad range of vocalizations, including the songs of other species, even though most species are unlikely to do so in the wild (Soha and Marler, 2000). Given the broad range of vocalizations that songbirds can learn to produce, song-copying errors and improvisations should accumulate over time and with geographic distance. Therefore, one would expect song dialects to diverge with geographical distance without limits. However, Marler (Marler and Nelson, 1992; Marler and Pickert, 1984) observed that song cultures diverge only over short geographical distances, and converge over very large distances, even across continents. Since production constraints could not explain the global convergence toward species-specific song cultures, he suggested that perceptual biases, e.g. in female song preference, could stabilize species-specific song features via sexual selection. But it appears that, in some respects, species-specific song culture can be explained by biases in song learning at the individual level (Fehér et al., 2009). Here, we focus on three features of song learning that can potentially explain how a stable and polymorphic song culture can be sustained over generations.

Transition from a graded to a categorical signal

Vocal communication signals can be either graded or categorical. Graded signals are characterized by a continuum of broadly distributed features (without clear ‘bumps’, Fig. 2C, left panel). An example of a graded signal is crying behavior in human infants: acoustically, crying is highly complex, and it can transfer important information about urgency and severity. Mothers can often identify the type of distress (hunger versus pain) expressed in their infant's cry (Gustafson and Harris, 1990), but by and large, the signal lies on a continuum (Stewart et al., 2013). On the other hand, categorical signals are characterized by a narrow or highly clustered distribution of features (Fig. 2C, right panel). For example, when zebra finches can see their peers, they tend to produce short calls. When an individual loses sight of its neighbors, it produces a long and loud contact call. In aggressive situations, it produces harsh (hiss-like) calls (Zann, 1996). Each of these call types is acoustically distinct and can be recognized as a distinct cluster in acoustic space. Acoustic variability within each of those call types is also meaningful, forming a rich, graded signal within each category, allowing birds to share a wide variety of behavioral states with their peers (Elie and Theunissen, 2015).

Across many vocal learner species, vocal development begins with a broad range of exploratory sounds called vocal babbling (Doupe and Kuhl, 1999; Knörnschild, 2014; Oller et al., 2008). In many songbirds, vocal babbling is characterized by graded signals, which develop into highly stereotyped syllable types found in adult song (Fig. 2A,B). We suspect that the developmental transition to a categorical signal tends to be weaker in vocal-learning mammals. Even in humans, where there is a clear developmental transition from vocal babbling to categorical speech, acoustically the signal remains surprisingly variable (Oller et al., 2013), which is why automatic speech recognition is so difficult. Songbirds are therefore unique in their strong developmental transition from highly variable to highly stereotyped vocalization (Fig. 2A,B). This transition has been studied extensively at both behavioral and neuronal levels (Aronov et al., 2008; Lipkind and Tchernichovski, 2011), and we suggest that it facilitates both vocal learning and song culture: the wide and continuous range of early vocal babbling is optimal for vocal exploration, namely for matching the ‘sensory templates’ of song syllables produced by an adult bird ‘tutor’. As song imitation progresses during development, distinct syllable types (clusters) emerge and differentiate (Fig. 2D). Although the sensory input is highly stereotyped and categorical in birdsong, the emergence of clusters in the developing songs takes place even in the absence of categorical sensory input. Song development is delayed in socially isolated birds, but even isolate songs eventually stabilize and show distinct syllable types (Morrison and Nottebohm, 1993; Price, 1979). Furthermore, we recently found that providing birds with delayed self-input, namely, training a bird with its own developing song, induces rapid emergence of clusters, similar to birds that were trained with categorical songs (Fehér et al., 2017).
Therefore, song imitation can be seen as a modulating factor, rather than the cause of this transition, which is internally driven (Tchernichovski and Marcus, 2014).

The early generation of distinct syllable types has implications at the level of song culture. Cultural transmission of a highly stereotyped signal with distinct categories (or symbols) should be easier, and it is more likely to remain stable over iterations compared with a graded signal. Interestingly, spontaneous emergence of a categorical signal has been reported in language evolution studies (Carr et al., 2016). In these experiments, human subjects were instructed to learn an artificial language, composed of arbitrary words, each representing objects differing in a visual feature (such as color and shape) and in movement. Using the learning outcome of one individual as the training set for the next individual in a transmission chain (in an iterated fashion, as in a telephone game), resulted in rapid emergence of structured languages (Kirby et al., 2008; Scott-Phillips and Kirby, 2010), even when the initial ‘meaning space’ (the mapping from words to objects) was entirely continuous.

The developmental transition from graded to categorical signaling is analogous to signal compression. At the extreme, compression could collapse the entire distribution into a single category. A more useful compression would cluster the signal into several categories. In many songbird species, mature songs are composed of ∼3–10 syllable types. But there are extreme cases: in chipping sparrows the broad distribution of song features collapses into a very simple song including a single (bird-specific) syllable type over the course of development (Liu and Nottebohm, 2007), whereas the songs of an adult California thrasher remain complex and variable (Sasahara et al., 2012). We suspect that a strong compression, resulting in fewer and more stable syllable types, should make song cultures more stable. Podos and Warren (2007) performed a meta-analysis of song dialects across species, which appears to support this notion: in songbirds that learn prior to dispersal (namely in the territory of their parents), song dialects are more common and more stable in species with smaller song repertoires.
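The compression described above can be illustrated with a toy clustering of a single acoustic feature. The sketch below is not taken from any of the cited studies: the syllable durations, the starting centroids and the function name `kmeans_1d` are hypothetical, and plain k-means merely stands in for whatever process drives clustering during development.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Cluster 1-D values around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assign every value to its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical syllable durations (ms): a broad feature range with three loose groups
durations = [48, 52, 55, 60, 140, 150, 155, 240, 250, 260]
centroids, clusters = kmeans_1d(durations, centroids=[0, 150, 300])
print(centroids)                   # one representative duration per category
print([len(c) for c in clusters])  # [4, 3, 3]
```

Ten distinct values are reduced to three category labels. Starting with a single centroid instead would reproduce the extreme case in which the whole distribution collapses into one category.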

Adaptive balance between convergence and divergence

As noted earlier, convergence in singing behavior might be counterbalanced by the accumulation of song-copying errors. But errors in song imitation are not entirely random: the accuracy of song imitation varies with environmental (Nowicki et al., 2002) and social (Chen et al., 2016; Tchernichovski and Nottebohm, 1998) conditions. Furthermore, there is evidence that the accuracy of song imitation may change adaptively – i.e. to counterbalance a strong convergence. For example, adult zebra finches typically produce a highly stereotyped song, including several repetitions of a single motif. Each motif is composed of 2–8 syllable types, produced in a fixed order (Fig. 2B). A juvenile zebra finch, raised singly with an adult male (tutor), will typically acquire a nearly perfect replica of his tutor's song. However, in a family setting, where a few siblings are interacting with a single tutor (their father), typically only one of them (the first one to imitate the father's song) will develop an accurate imitation. In the other siblings, song imitation is partially inhibited, resulting in divergence (Tchernichovski and Nottebohm, 1998). This is not due to lack of opportunity to learn from a busy tutor: a recent study showed that the rate at which tutors produced song is inversely related to pupil attention and to song learning (Chen et al., 2016). Therefore, divergence appears to be an active process. Furthermore, zebra finches accurately imitate song playbacks that they heard for several seconds per day, but imitation accuracy decreases with further exposure to song playback (Tchernichovski et al., 1999).

Fig. 3 presents an outcome of song learning in a social arena, where ten cages with juvenile pupils were arranged around a single adult tutor. As shown in Fig. 3B, only one pupil (P2) acquired an accurate replica of the tutor's song. A group of three pupils (P3, P10 and P6) developed songs that were only partially similar to the tutor's song. One bird (P4) improvised an entirely new song, consisting of call-like syllables. Another bird (P7) produced a hybrid song, with some syllables copied from his tutor, and other syllables copied from his peers (Derégnaucourt and Gahr, 2013) including from the abnormal ‘call-like’ song of P4. Interestingly, in cases where a tutor song is abnormally ‘monopolized’ by a single syllable type (as in birds P4 and P7 in Fig. 3), pupils imitate the song with a twist: Fig. 4A shows a song of a zebra finch tutor who was raised in isolation and developed an abnormal song. His song was dominated by a single syllable type (syllable B), which was repeated back-to-back and occupied about 80% of the song bout. His pupil copied syllable B, but its abundance decreased to 27%, and the distribution of syllable types in the pupil song became more diverse. A systematic investigation across birds showed that song imitation is sensitive to the abundance of syllable types (Fig. 4B) (Fehér et al., 2009; Tchernichovski and Marcus, 2014).

In general, juvenile songbirds tend to copy not only the structure of song syllables, but also the abundance (relative frequency) of each syllable type from their tutor. However, once the abundance of a tutor song syllable is higher than 30%, we see a ceiling effect in the imitation, such that the abundance of syllable types copied from the tutor rarely exceeds 30% (Fehér et al., 2009). This ceiling effect alone can explain why wild zebra finch songs are typically composed of at least three syllable types. It is analogous to a negative (balancing) frequency-dependent selection (Fitzpatrick et al., 2007), which is a specific type of natural selection that can explain the retention of polymorphism in phenotypes (Fig. 4C). In summary, an accurate song imitation is only one aspect of vocal learning in songbirds. Deviations from accurate imitation might reflect an adaptive balance between convergence and divergence. Divergence might be regulated by social inhibition of song imitation, which we observed in cases where a particular song is highly abundant across birds (Fig. 3), and by negative frequency-dependent selection of syllable types that are highly abundant within a song (Fig. 4).
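A toy calculation can make the ceiling effect concrete. The sketch below is an illustration under stated assumptions, not the model used in the cited work: the source reports only that copied abundance rarely exceeds about 30%, so the rule for passing the excess on to the remaining syllable types (in proportion to their current abundance), the tutor abundances, and the function name `cap_abundances` are all hypothetical.

```python
def cap_abundances(abundances, ceiling=0.30, tol=1e-12):
    """Cap each relative abundance at `ceiling` and pass the excess to uncapped types.

    Assumption: the excess is shared among the uncapped syllable types in
    proportion to their current abundance. Input abundances should sum to 1.
    """
    result = list(abundances)
    if ceiling * len(result) < 1.0:
        raise ValueError("too few syllable types to satisfy the ceiling")
    capped = [False] * len(result)
    while True:
        over = [i for i, a in enumerate(result) if a > ceiling + tol]
        if not over:
            return result
        excess = sum(result[i] - ceiling for i in over)
        for i in over:
            result[i] = ceiling
            capped[i] = True
        free = [i for i in range(len(result)) if not capped[i]]
        free_total = sum(result[i] for i in free)
        for i in free:
            if free_total > 0:
                result[i] += excess * result[i] / free_total
            else:
                result[i] += excess / len(free)


# Hypothetical isolate tutor song dominated by one syllable type (80%)
tutor = [0.80, 0.10, 0.06, 0.04]
pupil = cap_abundances(tutor)
print([round(a, 2) for a in pupil])  # [0.3, 0.3, 0.24, 0.16]
```

Under this rule the dominant syllable falls to the ceiling and the pupil's song becomes more evenly distributed across syllable types, which is the qualitative pattern described above.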

Evidence for active divergence via balancing selection has been documented only in domesticated zebra finches, which do not establish stable song dialects in nature. It is therefore an open question as to whether and to what extent balancing selection plays a role in natural song dialects. This is a difficult problem because tracking social interactions during song development in the wild is extremely challenging.

Directional biases in song learning stabilize feature distribution

We now return to the question that puzzled Marler: what converging forces could prevent song cultures from drifting apart along geographical distances without limits? The mechanism of convergence toward species-typical songs can be studied by tracking song learning across generations, starting from the abnormal song of an isolate founder. Interestingly, convergence toward a wild-type distribution of song features can be detected within three to four generations (Fehér et al., 2009). The imitation of isolate song syllables appears to be fairly complete: namely, almost every song syllable was copied from the isolate tutors (Fig. 4D). However, directional biases in the imitation process can be easily identified. For example, isolate songs often include abnormally long syllables (Fig. 4D, top, red bar), but the copies of such syllables in pupil songs tend to be shorter. Analysis across several birds (Fig. 4E) shows that syllable duration is copied accurately in the range of 30–270 ms, but above this range, the pupils' copies are always of shorter duration. The accumulation of this bias leads, within a few generations, to an upper limit of 270 ms, which is similar to the upper limit of syllable durations of zebra finches in our database. Such biases are analogous to directional signal filtering, which can, over generations, stabilize the distribution of song features within species-specific boundaries. In sum, at least some of the species-specific convergence that Marler observed across song cultures might be explained by directional biases during song learning.
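The accumulation of a directional bias over generations can be sketched as an iterated map. This is a minimal illustration, assuming a linear pull toward the upper limit: only the 270 ms limit comes from the text, whereas the `shrink` factor of 0.5, the founder duration of 430 ms, and the function names are hypothetical.

```python
def copy_duration(tutor_ms, upper_ms=270.0, shrink=0.5):
    """Return the pupil's copy of a tutor syllable duration (ms).

    Durations up to `upper_ms` are copied unchanged; longer ones are pulled
    toward `upper_ms`. The `shrink` factor is an illustrative assumption.
    """
    if tutor_ms <= upper_ms:
        return tutor_ms
    return upper_ms + shrink * (tutor_ms - upper_ms)


def transmit(founder_ms, generations):
    """Pass one syllable duration down a chain of tutor-pupil generations."""
    lineage = [founder_ms]
    for _ in range(generations):
        lineage.append(copy_duration(lineage[-1]))
    return lineage


# An abnormally long isolate syllable, followed over four generations
lineage = transmit(430.0, 4)
print(lineage)  # [430.0, 350.0, 310.0, 290.0, 280.0]
```

Durations inside the accurately copied range stay put, for example `transmit(120.0, 3)` returns `[120.0, 120.0, 120.0, 120.0]`, so the filter reshapes only the tail of the distribution.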

Earlier, we suggested that polymorphic cultures are sustained by a balance between converging and diverging forces (Fig. 1). We have identified such forces in song learning: (1) early developmental transition to categorical signals (clustering), which may ease cultural transmission and promote stability; (2) negative (balancing) frequency-dependent filtering, which may promote cultural polymorphism; and (3) directional filtering, which may sustain stable cultural boundaries. Overall, the process of song learning is more interesting than simply providing a mechanism for transferring information (Rendall et al., 2009): when songs are learned, the signal is compressed and filtered. We will now show that signal compression and filtering may also take place at the macro level, while ‘traveling’ through a social network.

Social structure inevitably impacts the formation and maintenance of song dialects in birds. For example, European starling populations that live in colonies exhibit more complex dialect patterns than those nesting individually, where certain song elements completely lack variation (Snowdon and Hausberger, 1997). However, only a few studies have investigated birdsong at the social network level (Sasahara et al., 2012; Weiss et al., 2014), and linking network structure to natural dialects is challenging. In contrast, social networks in humans have been studied extensively in many fields (Jackson, 2008). The most relevant studies focus on identifying mechanisms that can determine the emergence and retention of social conventions or public opinion. Interestingly, these studies show that the connectivity pattern (topology) of social networks can shape cultural forms, including their stability and polymorphism.

There are many examples of stable but polymorphic social conventions in humans. To name one, ethnic groups often coordinate shared linguistic conventions on accepted names for children and distinct conventions for naming pets (e.g. in the US, Buddy and Coco are commonly used for naming dogs and parrots, but rarely for naming children) (Ullmann-Margalit, 2015). Like song dialects, the shared vocabulary for naming pets can remain stable over decades despite frequent innovations of alternative options (Centola and Baronchelli, 2015). Semantic coordination can stem from dyadic interactions (Garrod and Anderson, 1987), but shared social conventions may also emerge from centralized authority, social leadership and aggregated information (Kearns et al., 2009; Salganik et al., 2006). Dyadic interactions and central authority both exert an influence on social networks through which information travels, and the structure of the network can affect the saliency of that influence (Dunbar, 2004; Nettle and Dunbar, 1997). For example, dissenters may pay a higher price if they live in a highly clustered social network, which therefore constrains the level of local divergence.

Centola and Baronchelli (2015) recently demonstrated experimentally that the structure of social networks has a crucial role in allowing or preventing the emergence of global conventions from dyadic coordination between individuals. Strikingly, this effect was demonstrated in conditions where the social networks were completely invisible to the subjects. They trained pairs of subjects to coordinate terms by presenting them with images and rewarded them when they managed to simultaneously use the same terminology to describe them. In one experiment, subjects played repeatedly with virtual neighbors, who played with their neighbors and so on, in a so-called ‘spatial social network’ (as in Fig. 5A, a chain-like network). Within a few iterations, many ‘neighbors’ managed to coordinate terms. However, competing conventions across the neighborhoods kept offsetting each other, and a global convention was never achieved. Interestingly, global social convention did emerge and became universally adopted in experiments when subjects were paired homogeneously (as in Fig. 5B, in a sparse equidistant arrangement of connections). In summary, the homogeneously connected social network acted as a converging force, whereas the spatial network promoted diversity and instability. These results were scale-invariant, namely, the topology but not the size of the social networks determined if a stable culture could emerge or not.
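Pairwise coordination on a network can be explored with a minimal "naming game" simulation. The sketch below is a toy model under stated assumptions and not a reconstruction of the experimental protocol: the update rule, the two network builders (`ring_neighbors` for a spatial network, `mixed_neighbors` for homogeneous pairing), the population size, and the number of rounds are all hypothetical choices.

```python
import random


def ring_neighbors(n, k=2):
    """Spatial network: each agent is linked to its k nearest agents on each side."""
    return {i: [(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)}


def mixed_neighbors(n):
    """Homogeneous pairing: every agent can be paired with every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def naming_game(neighbors, rounds, seed=0):
    """Run a minimal naming game and return each agent's final set of names."""
    rng = random.Random(seed)
    agents = sorted(neighbors)
    inventories = {i: set() for i in agents}
    next_name = 0
    for _ in range(rounds):
        speaker = rng.choice(agents)
        hearer = rng.choice(neighbors[speaker])
        if not inventories[speaker]:
            inventories[speaker].add(next_name)  # invent a new name
            next_name += 1
        name = rng.choice(sorted(inventories[speaker]))
        if name in inventories[hearer]:
            # Successful coordination: both agents keep only the agreed name
            inventories[speaker] = {name}
            inventories[hearer] = {name}
        else:
            # Failed coordination: the hearer remembers the new name
            inventories[hearer].add(name)
    return inventories


def distinct_names(inventories):
    """Count how many different names survive anywhere in the population."""
    return len(set().union(*inventories.values()))


n = 24
for label, net in [("spatial", ring_neighbors(n)), ("mixed", mixed_neighbors(n))]:
    final = naming_game(net, rounds=5000, seed=1)
    print(label, distinct_names(final))
```

The count of surviving names is a crude readout of convergence: a value of 1 means a single global convention. Results depend on the seed and on the number of rounds, so a single run should not be read as confirming or refuting the experimental findings.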

In the Centola and Baronchelli (2015) study, the outcome was either instability or a global consensus with a high conformity level. What network topology, if any, can promote a stable polymorphism? This is of particular importance to political scientists who are interested in the problem of keeping minority opinions ‘alive’ in online debates. A recent experimental study by Klar and Shmargad (2016) examined how under-represented viewpoints can ‘survive’ while traveling through experimentally designed social networks of different topologies. They found that spatial (highly clustered) networks promote consensus, quickly eliminating under-represented viewpoints (Fig. 5C), whereas more homogeneously connected social networks, called ‘small-world networks’ (Watts and Strogatz, 1998), retained minority viewpoints across many iterations (Fig. 5D). Therefore, as opposed to the Centola and Baronchelli (2015) study, here, the spatial network induced consensus, whereas the more homogeneously connected network promoted polymorphism.
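What "highly clustered" means for a spatial network can be quantified with the average clustering coefficient, the fraction of a node's neighbors that are also linked to each other. The sketch below builds a ring lattice and applies Watts-Strogatz-style random rewiring; the network size, the rewiring probability of 0.2, and the function names are illustrative assumptions, not the parameters of the cited experiments.

```python
import random


def ring_lattice(n, k=2):
    """Spatial network: each node is linked to its k nearest nodes on each side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0} for i in range(n)}


def rewire(graph, p, seed=0):
    """Watts-Strogatz-style rewiring: move one end of each edge with probability p."""
    rng = random.Random(seed)
    g = {i: set(nbrs) for i, nbrs in graph.items()}
    nodes = sorted(g)
    for u in nodes:
        for v in sorted(graph[u]):
            # Visit each original edge once (from its lower-numbered end)
            if u < v and v in g[u] and rng.random() < p:
                candidates = [w for w in nodes if w != u and w not in g[u]]
                if candidates:
                    w = rng.choice(candidates)
                    g[u].discard(v)
                    g[v].discard(u)
                    g[u].add(w)
                    g[w].add(u)
    return g


def average_clustering(graph):
    """Mean fraction of linked neighbor pairs, averaged over all nodes."""
    total = 0.0
    for nbrs in graph.values():
        nbrs = sorted(nbrs)
        deg = len(nbrs)
        if deg < 2:
            continue
        links = sum(
            1
            for a in range(deg)
            for b in range(a + 1, deg)
            if nbrs[b] in graph[nbrs[a]]
        )
        total += 2 * links / (deg * (deg - 1))
    return total / len(graph)


ring = ring_lattice(20)
print(average_clustering(ring))  # 0.5
print(average_clustering(rewire(ring, p=0.2, seed=1)))
```

In the ring lattice, half of each node's neighbor pairs are themselves connected. Rewiring replaces some local links with random long-range links while keeping the total number of links unchanged, which is how small-world networks are typically constructed from a lattice.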

The discrepancy between these studies is probably due to different constraints on interactions at the dyadic (micro) level: in the social conventions experiment, subjects constantly produced novel names for the pictures presented, whereas in the minority viewpoints experiment, subjects’ choices were binary (between two competing views). In both studies, the spatial network acted as a local filter, eliminating minority viewpoints or rare conventions before those could travel very far. However, with a narrow (binary) space of behaviors, similar local clusters can easily merge, and a global consensus is quickly reached (Fig. 5C). In contrast, with a broad space of behavioral options, clusters of local consensus are highly diverse and are therefore likely to collide rather than to merge, resulting in instability (Fig. 5A). Similarly, in both studies, the more homogeneous networks made it easier for rare morphs (e.g. two people holding the same minority viewpoint) to find each other, hence keeping their views alive. But here too, differences in signal bandwidth can lead to different outcomes: in a binary space, a majority is immediately apparent and links between rare morphs can evolve quickly. In a broadband space, all morphs are initially a minority, and a majority evolves slowly but persistently.

Earlier, we presented evidence for signal filtering during dyadic social song learning in birds and here we discuss signal filtering at the social network level in humans. But there are interactions between those two levels. For example, the results of the human studies suggest that the outcome of network-level filtering depends strongly on the bandwidth of the dyadic behavioral interactions. Can this also apply to birdsong culture? Many songbirds are territorial, and their communication networks are naturally spatial. In species where the song repertoire size is small and song copying is highly accurate, the situation might be similar to that of Klar and Shmargad (2016). If this analogy is correct, the spatial network topology is likely to filter out rare syllable types over iterations, potentially imploding the local dialect. In songbird species where song repertoire is rich, or when error and improvisation rates are high, the scenario might be more similar to that proposed by Centola and Baronchelli (2015), where the spatial communication network might promote instability in the shared repertoire.

Note that according to the hypotheses presented above, an evolutionary change in the bandwidth of singing behavior may flip the effect of the spatial network – turning it from a converging force into a diverging force. Can this be beneficial for the birds? Assuming that local song dialects are advantageous, what evolutionary forces may maintain them in different scenarios? The evolution of different territorial, dispersal or migratory behaviors could potentially alter network topology between the spatial and homogeneous extremes, hence counterbalancing converging or diverging tendencies to maintain a stable polymorphic song dialect. However, changes in territorial or migratory behaviors have strong ecological consequences. Evolutionary changes at the level of song learning are likely to be less costly. For example, the evolution of higher improvisation rates or of a mechanism for negative (balancing) frequency-dependent filtering as we discussed earlier could counteract convergence due to spatial network topology. Therefore, relatively minor changes in features of song learning could potentially balance converging and diverging forces to retain a stable and polymorphic song dialect.

In sum, although birdsong dialects, social conventions and public opinion are studied by separate scientific disciplines, it might be useful to integrate knowledge across them. In the case of birdsong culture, the learning process is more readily available for mechanistic investigation, while human studies provide the opportunity to investigate cultural mechanisms at the social network level, where topology can be experimentally controlled. In both cases, the manner in which signals are compressed and filtered – either during learning or while traveling through the social network – can shape cultures by shifting the balance between convergence and divergence. We think that understanding cultures across the levels of dyadic social interaction and social networks may have far-reaching implications. We conclude by briefly outlining such implications, focusing on how manipulation of signal compression and filtering could be used to promote stable polymorphism in online communication systems.

During the past decade, social media and crowd-sourcing platforms have transformed how public opinion is shared, guiding everyday decisions from picking a restaurant to expressing support by ‘liking’ posts and signing petitions. Looking at online platforms through the lens of cultural stability and polymorphism, they often seem unbalanced: either too chaotic, or highly biased and monolithic. Such outcomes could be unintended social consequences of recent advances in communication technology. Centola and Baronchelli (2015) suggested that the increase in social connectedness via social media could potentially facilitate the convergence of public opinion among people who do not even know that they are implicitly coordinating with one another. Other studies show that such convergence can induce a phase transition (or non-linearity), shifting public opinion from moderate views towards extremism (Ramos et al., 2015), causing community disconnection (Gil and Zanette, 2006) and echo-chamber effects, particularly in domains with high emotional salience (Cowan, 2014; Jasny et al., 2015; Pentland, 2014). We will conclude by presenting a coarse outline for technical approaches to counteract imbalances in online communication systems.

We noted earlier that converging and diverging forces can shape cultures at two levels: social learning and network topology. It is rarely practical (or desired) to modify the social networks of citizens engaged in online communication platforms. However, information-sharing protocols are easy to manipulate, and such manipulations can potentially influence how social learning spreads and accumulates. Take, for example, the Klar and Shmargad (2016) study we discussed earlier: they found that spatial social networks filtered out under-represented viewpoints, whereas small-world networks promoted their survival. Instead of manipulating social network topology, online platforms could manipulate the way information is presented to users. For instance, social media sites automatically organize each post into categories, or into topics that emerge from trends across posts (Hong and Davison, 2010). We wonder if different strategies for clustering under-represented viewpoints could potentially affect their survival rate. Is there any simple equivalence between the effects of manipulating the topology of information presentation versus network topology with respect to cultural outcomes?

Consider the design of online petition systems. The US White House petition platform is designed to filter petitions efficiently: a petition must receive >100,000 signatures within 30 days to be considered. The platform provides no mechanism for similar petitions to merge or to evolve. We suspect that different methods for compressing and filtering information may result in very different cultures. Stable polymorphic cultures cannot easily emerge in highly competitive, rapid-turnover platforms, or in ‘timeline’-based social media platforms. However, there is some preliminary evidence that regulating the filtering and presentation of information in online reviewing platforms can induce incremental improvement in public service quality via distributed social learning. Crowd-sourced reviewing platforms are highly popular (Mackiewicz, 2009; Zhu and Zhang, 2010). For example, Yelp owns a database with about 100 million anonymous restaurant reviews, which is used by about 135 million monthly visitors. Mean scores are presented by star rating, and even a moderate change from 3.5 to 4 stars in Yelp increases the chances of a restaurant being booked by about 19% (Anderson and Magruder, 2012). Clearly, biases and fraud (Luca and Zervas, 2013; Racherla and Friske, 2012) are serious concerns in such ‘learning from the crowd’ platforms. Even with potential fraud set aside, cumulative star rating provides little opportunity for social learning across clients and providers of services: for a new venue, a random (or malicious) cluster of a few negative reviews is likely to drop the mean score strongly enough to ruin a business. For a highly popular venue, the unresponsiveness of the cumulative score to incremental changes may fail to provide adequate motivation for enhanced efforts. We suggest that the ubiquitous 5-star rating system compresses rating information too strongly. Evidence from a recent field study suggests that adjusting the level of temporal granularity (i.e. compression) of the presentation of service ratings can potentially keep a crowd-sourced reviewing platform in an agile state, where client feedback can drive incremental improvement in services over time scales of years (Tchernichovski et al., 2016). Instead of a star rating, the study presented service clients with short-term trends of client satisfaction with service outcome. Those trends were presented on the service request forms, making them apparent to both service clients and service providers. Presenting trends allowed regulation of the compression level: larger bins pooled more data, providing more robust estimates at the cost of lower sensitivity to change, and vice versa. For social learning to be efficient, the granularity of the trends, namely their sensitivity to changes in service quality, should correspond to time scales of service responsiveness. That is, trends should be presented in a temporal resolution that can allow service workers sufficient time to adjust, and prevent discouragement. The positive outcome of the field study, although on a small scale, suggested that synchronizing social learning across networks of service workers and clients might promote a culture of engagement in service improvement.
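
The trade-off between a cumulative score and binned trends can be sketched in a few lines. The example below is a hypothetical illustration, not the procedure or data of the field study: the function names, the rating sequence and the bin sizes are all assumptions chosen to make the arithmetic easy to follow. Ratings are assumed to arrive in chronological order on a 1 to 5 scale, and service quality is assumed to improve from a mean of 2 to a mean of 4 near the end of the record.

```python
def cumulative_mean(ratings):
    """Single score that pools every rating ever received (maximal compression)."""
    return sum(ratings) / len(ratings)


def binned_trend(ratings, bin_size):
    """Mean rating per consecutive bin of `bin_size` ratings, in chronological order.

    A trailing bin with fewer than `bin_size` ratings is kept as is.
    """
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")
    chunks = (ratings[i:i + bin_size] for i in range(0, len(ratings), bin_size))
    return [sum(chunk) / len(chunk) for chunk in chunks]


if __name__ == "__main__":
    # Hypothetical record: 40 noisy ratings around 2, then 10 noisy ratings around 4.
    ratings = [1, 3] * 20 + [3, 5] * 5

    print("cumulative mean:", cumulative_mean(ratings))
    print("bins of 1:      ", binned_trend(ratings, 1))   # follows every fluctuation
    print("bins of 10:     ", binned_trend(ratings, 10))  # smooths noise, shows the shift
    print("bins of 50:     ", binned_trend(ratings, 50))  # same as the cumulative mean

    # Sensitivity of a cumulative score to three 1-star reviews.
    print("new venue:      ", cumulative_mean([5, 5, 5] + [1, 1, 1]))
    print("popular venue:  ", cumulative_mean([4] * 1000 + [1, 1, 1]))
```

With these assumed numbers, the cumulative mean barely registers the improvement, bins of a single rating mostly display noise, and intermediate bins reveal the change while averaging out the fluctuations. The same cumulative score also reacts very differently to three negative reviews depending on how many ratings a venue has already accumulated.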

Overall, we present here a preliminary framework for studying how features of social learning and of distributed communication systems may shape culture. We integrate findings and ideas across the scientific disciplines of birdsong and human social networks with the goal of outlining some common threads. For birdsong culture, we suggest that the results of large-scale human social network studies provide a framework for understanding how features of territorial networks may cause convergence or divergence of song dialects. For human online cultures, we suggest that it may be useful to consider features of birdsong learning, which have been optimized over millions of generations to give rise to stable polymorphic cultures. Experimenting with the implementation of similar features in online communication systems could facilitate the design of more stable and balanced information systems, which might in turn promote distributed self-governance.

We thank Lucas C. Parra, Hernan Makse, Xanadu Halkias and Julia Hyland Bruno for comments and suggestions.

Funding

Supported by grants from the US National Institutes of Health (DC004722-18) and from the US National Science Foundation (1261872). Deposited in PMC for release after 12 months.

Akçay, Ç., Tom, M. E., Campbell, S. E. and Beecher, M. D. (2013). Song type matching is an honest early threat signal in a hierarchical animal communication system. Proc. R. Soc. B Biol. Sci. 280, 20122517.
Anderson, M. and Magruder, J. (2012). Learning from the crowd: regression discontinuity estimates of the effects of an online review database. Econ. J. 122, 957-989.
Aronov, D., Andalman, A. S. and Fee, M. S. (2008). A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science 320, 630-634.
Badii, R. and Politi, A. (1999). Complexity: Hierarchical structures and scaling in physics. Cambridge Nonlinear Science Series, 6. New York: Cambridge University Press.
Benichov, J., Benezra, S., Vallentin, D., Globerson, E., Long, M. and Tchernichovski, O. (2015). The forebrain song system mediates predictive call timing in female and male zebra finches. Curr. Biol. 26, 309-318.
Brennan, S. and Clark, H. (1996). Conceptual pacts and lexical choice in conversation. J. Exp. Psychol. 22, 1482.
Carr, J. W., Smith, K., Cornish, H. and Kirby, S. (2016). The cultural evolution of structured languages in an open-ended, continuous world. Cogn. Sci. doi:10.1111/cogs.12371.
Centola, D. and Baronchelli, A. (2015). The spontaneous emergence of conventions: an experimental study of cultural evolution. Proc. Natl. Acad. Sci. USA 112, 1989-1994.
Chen, Y., Matheson, L. E. and Sakata, J. T. (2016). Mechanisms underlying the social enhancement of vocal learning in songbirds. Proc. Natl. Acad. Sci. USA 113, 6641-6646.
Cowan, S. K. (2014). Secrets and misperceptions: the creation of self-fulfilling illusions. Sociol. Sci. 1, 466.
Derégnaucourt, S. and Gahr, M. (2013). Horizontal transmission of the father's song in the zebra finch (Taeniopygia guttata). Biol. Lett. 9, 20130247.
Doupe, A. J. and Kuhl, P. K. (1999). Birdsong and human speech: common themes and mechanisms. Annu. Rev. Neurosci. 22, 567-631.
Dunbar, R. (2004). Gossip in evolutionary perspective. Rev. Gen. Psychol.
Elie, J. E. and Theunissen, F. E. (2015). Meaning in the avian auditory cortex: neural representation of communication calls. Eur. J. Neurosci. 41, 546-567.
Fehér, O., Wang, H., Saar, S., Mitra, P. P. and Tchernichovski, O. (2009). De novo establishment of wild-type song culture in the zebra finch. Nature 459, 564-568.
Fehér, O., Ljubičić, I., Suzuki, K., Okanoya, K. and Tchernichovski, O. (2017). Statistical learning in songbirds: from self-tutoring to song culture. Philos. Trans. R. Soc. B Biol. Sci. 372, 20160053.
Fitzpatrick, M. J., Feder, E., Rowe, L. and Sokolowski, M. B. (2007). Maintaining a behaviour polymorphism by frequency-dependent selection on a single gene. Nature 447, 210-212.
Galantucci, B. (2005). An experimental study of the emergence of human communication systems. Cogn. Sci. 29, 737-767.
Garcia, N. C., Arrieta, R. S., Kopuchian, C. and Tubaro, P. L. (2015). Stability and change through time in the dialects of a Neotropical songbird, the Rufous-collared Sparrow. Emu 115, 309.
Garland, E. C., Goldizen, A. W., Rekdahl, M. L., Constantine, R., Garrigue, C., Hauser, N. D., Poole, M. M., Robbins, J. and Noad, M. J. (2011). Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Curr. Biol. 21, 687-691.
Garrod, S. and Anderson, A. (1987). Saying what you mean in dialogue: a study in conceptual and semantic co-ordination. Cognition 27, 181-218.
Garrod, S., Fay, N., Lee, J., Oberlander, J. and MacLeod, T. (2007). Foundations of representation: where might graphical symbol systems come from? Cogn. Sci. 31, 961-987.
Gil, S. and Zanette, D. H. (2006). Coevolution of agents and networks: opinion spreading and community disconnection. Phys. Lett. A 356, 89-94.
Gustafson, G. E. and Harris, K. L. (1990). Women's responses to young infants’ cries. Dev. Psychol. 26, 144.
Hassler, W. W. and Hogarth, W. T. (1977). The growth and culture of dolphin, Coryphaena hippurus, in North Carolina. Aquaculture 12, 115-122.
Henrich, J. (2001). Cultural transmission and the diffusion of innovations: adoption dynamics indicate that biased cultural transmission is the predominate force in behavioral change. Am. Anthropol. 103, 992-1013.
Heyes, C. M. and Galef, B. G., Jr (1996). Social Learning In Animals: The Roots of Culture. San Diego: Academic Press.
Hong, L. and Davison, B. (2010). Empirical study of topic modeling in Twitter. In Proceedings of the First Workshop on Social Media Analytics, pp. 80-88. ACM.
Jackson, M. O. (2008). Social and Economic Networks, Vol. 3. Princeton: Princeton University Press.
Jasny, L., Waggle, J. and Fisher, D. R. (2015). An empirical examination of echo chambers in US climate policy networks. Nat. Clim. Chang. 5, 782-786.
Jenkins, P. F. (1978). Cultural transmission of song patterns and dialect development in a free-living bird population. Anim. Behav. 26, 50-78.
Kearns, M., Judd, S., Tan, J. and Wortman, J. (2009). Behavioral experiments on biased voting in networks. Proc. Natl. Acad. Sci. USA 106, 1347-1352.
Kirby, S., Cornish, H. and Smith, K. (2008). Cumulative cultural evolution in the laboratory: an experimental approach to the origins of structure in human language. Proc. Natl. Acad. Sci. USA 105, 10681-10686.
Klar, S. and Shmargad, Y. (2016). The Effect of Network Structure on Preference Formation. In NYU CESS 9th Annual Experimental Political Science Conference, New York, New York, USA.
Knörnschild, M. (2014). Vocal production learning in bats. Curr. Opin. Neurobiol. 28, 80-85.
Lachlan, R. F., van Heijningen, C. A. A., ter Haar, S. M. and ten Cate, C. (2016). Zebra Finch song phonology and syntactical structure across populations and continents—a computational comparison. Front. Psychol. 7, 980.
Laland, K. N. and Hoppitt, W. (2003). Do animals have culture? Evol. Anthropol. 12, 150-159.
Levelt, W. J. M. and Kelter, S. (1982). Surface form and memory in question answering. Cogn. Psychol. 14, 78-106.
Lipkind, D. and Tchernichovski, O. (2011). Quantification of developmental birdsong learning from the subsyllabic scale to cultural evolution. Proc. Natl. Acad. Sci. USA 108 Suppl., 15572-15579.
Liu, W.-C. and Nottebohm, F. (2007). A learning program that ensures prompt and versatile vocal imitation. Proc. Natl. Acad. Sci. USA 104, 20398-20403.
Luca, M. and Zervas, G. (2013). Fake it till you make it: reputation, competition, and Yelp review fraud. Manag. Sci.
MacDougall-Shackleton, E. A. and MacDougall-Shackleton, S. A. (2001). Cultural and genetic evolution in mountain white-crowned sparrows: song dialects are associated with population structure. Evolution 55, 2568-2575.
Mackiewicz, J. (2009). Assertions of expertise in online product reviews. J. Bus. Tech. Commun. 24, 3-28.
Mammen, D. L. and Nowicki, S. (1981). Individual differences and within-flock convergence in chickadee calls. Behav. Ecol. Sociobiol. 9, 179-186.
Maney, D. L., MacDougall-Shackleton, E. A., MacDougall-Shackleton, S. A., Ball, G. F. and Hahn, T. P. (2003). Immediate early gene response to hearing song correlates with receptive behavior and depends on dialect in a female songbird. J. Comp. Physiol. A Neuroethol. Sens. Neural. Behav. Physiol. 189, 667-674.
Marler, P. and Nelson, D. (1992). Neuroselection and song learning in birds: species universals in a culturally transmitted behavior. Semin. Neurosci. 4, 415-423.
Marler, P. and Pickert, R. (1984). Species-universal microstructure in the learned song of the swamp sparrow (Melospiza georgiana). Anim. Behav. 32, 673-689.
Marler, P. and Tamura, P. M. (1962). Song “Dialects” in three populations of white-crowned sparrows. Condor 64, 368-377.
Morrison, R. G. and Nottebohm, F. (1993). Role of a telencephalic nucleus in the delayed song learning of socially isolated zebra finches. J. Neurobiol. 24, 1045-1064.
Mundinger, P. C. (1970). Vocal imitation and individual recognition of finch calls. Science 168, 480-482.
Nettle, D. and Dunbar, R. I. M. (1997). Social markers and the evolution of reciprocal exchange. Curr. Anthropol. 38, 93-99.
Nowicki, S., Searcy, W. A. and Peters, S. (2002). Brain development, song learning and mate choice in birds: a review and experimental test of the “nutritional stress hypothesis”. J. Comp. Physiol. A Neuroethol. Sens. Neural. Behav. Physiol. 188, 1003-1014.
Oller, D. K., Wieman, L. A., Doyle, W. J. and Ross, C. (2008). Infant babbling and speech. J. Child Lang. 3, 1-11.
Oller, D. K., Buder, E. H., Ramsdell, H. L., Warlaumont, A. S., Chorna, L. and Bakeman, R. (2013). Functional flexibility of infant vocalization and the emergence of language. Proc. Natl. Acad. Sci. USA 110, 6318-6323.
O'Loghlen, A. L. and Rothstein, S. I. (1995). Culturally correct song dialects are correlated with male age and female song preferences in wild populations of brown-headed cowbirds. Behav. Ecol. Sociobiol. 36, 251-259.
Pentland, A. (2014). Social Physics: How Good Ideas Spread-The Lessons from a New Science. Penguin Publishing Group.
Pickering, M. J. and Garrod, S. (2006). Alignment as the basis for successful communication. Res. Lang. Comput. 4, 203-228.
Podos, J. and Warren, P. S. (2007). The evolution of geographic variation in birdsong. Adv. Study Behav. 37, 403-458.
Price, P. H. (1979). Developmental determinants of structure in zebra finch song. J. Comp. Physiol. Psychol. 93, 260.
Racherla, P. and Friske, W. (2012). Perceived “usefulness” of online consumer reviews: an exploratory investigation across three services categories. Electron. Commer. Res. Appl. 11, 548-559.
Ramos, M., Shao, J., Reis, S. D. S., Anteneodo, C., Andrade, J. S., Havlin, S., Makse, H. A., Castellano, C., Fortunato, S., Simon, B., et al. (2015). How does public opinion become extreme? Sci. Rep. 5, 10032.
Reiss, D. (2011). The Dolphin in the Mirror: Exploring Dolphin Minds and Saving Dolphin Lives. New York: Houghton Mifflin Harcourt.
Rendall, D., Owren, M. J. and Ryan, M. J. (2009). What do animal signals mean? Anim. Behav. 78, 233-240.
Salganik, M. J., Dodds, P. S. and Watts, D. J. (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science 311, 854-856.
Sasahara, K., Cody, M. L., Cohen, D. and Taylor, C. E. (2012). Structural design principles of complex bird songs: a network-based approach. PLoS ONE 7, e44436.
Scott-Phillips, T. C. and Kirby, S. (2010). Language evolution in the laboratory. Trends Cogn. Sci. 14, 411-417.
Snowdon, C. T. and Hausberger, M. (1997). Social Influences on Vocal Development. Cambridge: Cambridge University Press.
Soha, J. A. and Marler, P. (2000). A species-specific acoustic cue for selective song learning in the white-crowned sparrow. Anim. Behav. 60, 297-306.
Stewart, A. M., Lewis, G. F., Heilman, K. J., Davila, M. I., Coleman, D. D., Aylward, S. A. and Porges, S. W. (2013). The covariation of acoustic features of infant cries and autonomic state. Physiol. Behav. 120, 203-210.
Stoddard, P. K., Beecher, M. D., Campbell, S. E. and Horning, C. L. (1992). Song-type matching in the song sparrow. Can. J. Zool. 70, 1440-1444.
Tchernichovski, O. and Marcus, G. (2014). Vocal learning beyond imitation: mechanisms of adaptive vocal development in songbirds and human infants. Curr. Opin. Neurobiol. 28, 42-47.
Tchernichovski, O. and Nottebohm, F. (1998). Social inhibition of song imitation among sibling male zebra finches. Proc. Natl. Acad. Sci. USA 95, 8951-8956.
Tchernichovski, O., Lints, T., Mitra, P. P. and Nottebohm, F. (1999). Vocal imitation in zebra finches is inversely related to model abundance. Proc. Natl. Acad. Sci. USA 96, 12901-12904.
Tchernichovski, O., Lints, T. J., Deregnaucourt, S., Cimenser, A. and Mitra, P. P. (2004). Studying the song development process: rationale and methods. Ann. N. Y. Acad. Sci. 1016, 348-363.
Tchernichovski, O., Brinkmann, P., Fimiarz, D., Halkias, X., Parra, L. and Conley, D. (2016). Optimizing social learning in voluntary rating systems for public services: real and virtual world experiments in distributed governance. In NYU CESS 9th Annual Experimental Political Science Conference, New York, New York, USA.
Ullmann-Margalit, E. (2015). The Emergence of Norms. Oxford: Oxford University Press.
van de Waal, E., Borgeaud, C. and Whiten, A. (2013). Potent social learning and conformity shape a wild primate's foraging decisions. Science 340, 483-485.
Watts, D. J. and Strogatz, S. H. (1998). Collective dynamics of “small-world” networks. Nature 393, 440-442.
Weiss, M., Hultsch, H., Adam, I., Scharff, C. and Kipper, S. (2014). The use of network analysis to study complex animal communication systems: a study on nightingale song. Proc. R. Soc. B Biol. Sci. 281, 20140460.
West, M. J. and King, A. P. (1988). Female visual displays affect the development of male song in the cowbird. Nature 334, 244-246.
Zann, R. A. (1996). The Zebra Finch: A Synthesis of Field and Laboratory Studies. 1st edn. Oxford: Oxford University Press.
Zhu, F. and Zhang, X. (Michael) (2010). Impact of online consumer reviews on sales: the moderating role of product and consumer characteristics. J. Mark. 74, 133-148.

Competing interests

The authors declare no competing or financial interests.