Traditionally, the ultrasonic vocal repertoire of rats is differentiated into 22 kHz and 50 kHz calls, two categories that contain multiple different call types. Although both categories have different functions, they are sometimes produced in the same behavioral context. Here, we investigated the peripheral mechanisms that generate sequences of calls from both categories. Male rats, either sexually experienced or naïve, were exposed to an estrous female. The majority of sexually naïve male rats produced 22 kHz and 50 kHz calls on their first encounter with a female. We recorded subglottal pressure and electromyographic activity of laryngeal muscles and found that male rats sometimes concatenate long 22 kHz calls and 50 kHz trill calls into an utterance produced during a single breath. The qualitatively different laryngeal motor patterns for both call types were produced serially during the same breathing cycle. The finding demonstrates flexibility in the laryngeal–respiratory coordination during ultrasonic vocal production, which has not been previously documented physiologically in non-human mammals. Since only naïve males produced the 22 kHz-trills, it is possible that the production is experience dependent.
Understanding how vocal utterances are assembled and combined is of interest to the investigation of vocal communication in mammals, since these mechanisms affect call rate, call combinations, call complexity and vocal repertoires. All these variables are functionally important in mammalian vocal behavior, including human speech (e.g. ten Cate and Okanoya, 2012; Zuberbühler, 2015). Call production requires the precise control of laryngeal and breathing movements (Jürgens, 2009), as well as the coordination of pharyngeal and oral musculature (Hauser and Ybarra, 1994). The concatenation of simple calls into more complex, composite vocalizations requires more sophisticated motor control, in particular over breathing movements, because vocal units can be produced during a single breath or during separate exhalations (Franz and Goller, 2002). Comparative work suggests that call concatenation into a single breath is employed by horseshoe bats (Rübsamen and Betz, 1986), squirrel monkeys (Häusler, 2000) as well as rats (Riede, 2014). While the previous study in rats focused on one particular group of call types, the 50 kHz calls, the current experiments investigated how different call categories, 22 kHz and 50 kHz calls, are assembled.
Rats produce a large repertoire of ultrasonic vocalizations (USVs) (e.g. Wright et al., 2010; Brudzynski, 2013) that play important roles in their social interactions (e.g. Brudzynski, 2009; Willadsen et al., 2014). The rat model has allowed comprehensive recording of peripheral vocal movements (Riede, 2011). Parameters such as nasal airflow or subglottal pressure provide insight into how central motor commands are translated into respiratory movements (Riede, 2011, 2013, 2014; Hegoburu et al., 2011; Sirotin et al., 2014). The coarse temporal structure of ultrasonic call bouts is determined by respiratory movements (Sirotin et al., 2014), but subglottal pressure is further modulated by laryngeal muscles (Riede, 2011). As in many other mammals (Larson and Kistler, 1984; Luschei et al., 2006), the laryngeal valve in rats contributes to laryngeal resistance, and glottal geometry is crucial for vocal production (Riede, 2013). Recordings of subglottal pressure therefore provide a physiological correlate reflecting both respiratory motor activity and the activity of laryngeal muscles that are involved in flow control.
Rats produce two categories of ultrasonic calls, 22 kHz and 50 kHz calls, which have different functions (e.g. Knutson et al., 2002; Burgdorf et al., 2008; Brudzynski, 2009). In anticipation of pain or danger, rats often produce long bouts of calls in the 19–28 kHz range with little or no frequency modulation, which are referred to as ‘22 kHz calls’. The playback of 22 kHz calls causes freezing behavior (Endres et al., 2007; Allen et al., 2007; Bang et al., 2008; Kim et al., 2010; Parsana et al., 2012a,b). Vocalizations with fundamental frequencies between 28 and 90 kHz are collectively referred to as ‘50 kHz calls’. Production of 50 kHz calls is associated with mating and other social interactions or the expectation of reward (Burgdorf et al., 2008). Playbacks of 50 kHz calls induce approach behavior in both male and female rats, promoting social contact (Seffer et al., 2014; Willadsen et al., 2014). During the encounter with an unfamiliar estrous female prior to mating, a male's vocalizations can contain call types from both the 22 kHz and 50 kHz call categories (e.g. Geyer and Barfield, 1978; Barfield et al., 1979; McGinnis and Vakulenko, 2003; Snoeren and Ågmo, 2014). This context therefore provides the possibility for investigating the motor patterns that give rise to utterances composed of functionally different call types.
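The frequency ranges given above can be turned into a simple sorting rule. The following is a minimal sketch only: the function name `classify_call` is hypothetical, and assigning a call at exactly 28 kHz to the 50 kHz category is an assumption, since the text lists 28 kHz as the boundary of both ranges. Call-type assignment in this study followed spectrographic descriptions, not a frequency threshold alone.

```python
def classify_call(f0_khz):
    """Assign a call category from its fundamental frequency in kHz.

    Ranges follow the text: 19-28 kHz for '22 kHz calls' and 28-90 kHz
    for '50 kHz calls'. Treating exactly 28 kHz as a 50 kHz call is an
    assumption made for this sketch.
    """
    if 19.0 <= f0_khz < 28.0:
        return "22 kHz call"
    if 28.0 <= f0_khz <= 90.0:
        return "50 kHz call"
    return "unclassified"


# Hypothetical fundamental frequencies, for illustration only.
for f0 in (22.5, 28.0, 61.0, 12.0):
    print(f0, classify_call(f0))
```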
MATERIALS AND METHODS
The data presented here were obtained from a total of 12 male rats. Six pairs of male littermates (N=12) (purchased from Charles River Labs, Wilmington, MA) weaned at 21 days were co-housed in standard rodent cages after weaning, during shipping and after arrival at Midwestern University. Littermates were separated on postnatal day (P)42 and housed singly thereafter. One male of each pair (‘experienced male’) was co-housed with a female for 12 h on P46. The second male was not exposed to a female (‘naïve male’).
Setup and experiments
Experiments were conducted on P54. Two standard rodent cages were connected by a 12-cm-diameter PVC tube (Fig. 1). The tube could be entered only from one cage (female cage) about two-thirds down to the second cage. An 8 mm mesh prevented further movement. Three large holes (5 mm diameter) in the wall of the male cage allowed a rat to push through with the snout. The male and female were able to have snout–snout contact through the holes and the mesh. Sound was recorded over the male cage.
For each experiment, a male was habituated for 10 min in the male cage and then recorded for 5 min with an ultrasonic microphone (Avisoft Bioacoustics, CM16/CMPA-5V) placed 5 cm over the center of the male cage, while the female cage was empty. Then an estrous female was placed in the female cage, and the male was recorded for an additional 5 min. Vocal activity increased dramatically after the female was added. At the end of the second 5 min recording period the female was placed into the male cage for 2 min.
For comparison, 22 kHz calls were recorded in an aversive context during a second experiment on P60. The male was placed in the test cage and habituated for 10 min. The experimenter then applied five short, mild air puffs to the facial region of the unrestrained animal through a long narrow tube. Such air puffs function as an aversive stimulus and trigger calls (Knapp and Pohorecky, 1995). The animals started vocalizing within a few seconds of the start of the air puffs. All vocalizations were recorded for 5 min.
Measurement of vocal motor patterns
Between P70 and P85, we successfully investigated the movements underlying vocal behavior in 10 of the 12 animals (4 experienced and 6 naïve males). Subglottal pressure was measured by implanting a stainless steel tube in the upper third of the trachea. The tube was connected to a pressure transducer (model FHM-02PGR-02; Fujikura) housed in a backpack. Electromyographic (EMG) activity of one intrinsic laryngeal muscle (thyroarytenoid muscle) was recorded with bipolar silver electrodes. The calibrated pressure signal and EMG activity were recorded simultaneously with the sound signal and acquired through an NiDAQ 6212 acquisition device, sampled at 200 kHz, and saved as uncompressed files using Avisoft Recorder v.3.4.2 software (Avisoft Bioacoustics, Berlin, Germany).
Procedures were in accordance with National Institutes of Health guidelines for experiments involving vertebrate animals and were approved by the Institutional Animal Care and Use Committee at Midwestern University, Glendale, Arizona.
Vocal behavior was analyzed for call type (following the spectrographic description by Wright et al., 2010), vocal rate, call duration and call fundamental frequency using PRAAT software (www.praat.org). EMG recordings were differentially amplified (model EX4-400, Dagan Corporation), bandpass filtered (100–3000 Hz), full-wave rectified, low-pass filtered (200 Hz) and normalized to a rat's maximum EMG activity recorded during swallowing (also using PRAAT).
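The rectification, smoothing and normalization steps of the EMG processing can be sketched as follows. This is an illustration under stated assumptions, not the PRAAT procedure used in the study: the input is assumed to be already bandpass filtered, the low-pass stage is a simple first-order filter chosen for brevity (the filter design used in the study is not specified here), and the function names are hypothetical. The 200 Hz cutoff and 200 kHz sampling rate are taken from the text.

```python
import math


def rectify(signal):
    """Full-wave rectification: absolute value of every sample."""
    return [abs(x) for x in signal]


def low_pass(signal, cutoff_hz, sample_rate_hz):
    """First-order low-pass filter (a simple stand-in, see lead-in)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    y = 0.0
    for x in signal:
        y += alpha * (x - y)  # move the output a fraction alpha toward the input
        out.append(y)
    return out


def emg_envelope(signal, swallow_signal, sample_rate_hz=200_000, cutoff_hz=200.0):
    """Rectify and smooth an EMG trace, then express it as a fraction of
    the peak envelope recorded during swallowing."""
    envelope = low_pass(rectify(signal), cutoff_hz, sample_rate_hz)
    reference = max(low_pass(rectify(swallow_signal), cutoff_hz, sample_rate_hz))
    if reference <= 0.0:
        raise ValueError("swallowing reference contains no activity")
    return [value / reference for value in envelope]
```

With this normalization, a value of 1.0 corresponds to the peak activity during swallowing, so the 20–40% and 100% levels reported for the two call components would appear as 0.2–0.4 and 1.0.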
We used different acoustic parameters (fundamental frequency, call duration, trill rate) to describe a call type. Similarity between calls produced alone or as a segment of a concatenated call (fixed effect predictor) was assessed using mixed-effect linear regression with the individual rat as a random effect. A false-discovery rate adjustment was performed to keep the false discovery rate at the nominal <0.05 level. The extent of an inhalation might be related to the type or the duration of the call that is produced subsequently. An ANCOVA was used to test whether inhalation duration (dependent variable) before different call types (independent variable; 22 kHz, 50 kHz and 22 kHz trill calls) was similar, while controlling for absolute call duration (covariate). Finally, to determine whether production rates of different call types (22 kHz, 50 kHz and 22 kHz trill calls) were related to overall vocal activity, we used Spearman rank correlation analysis. Tests were performed in SPSS (v.22; Chicago, IL, USA).
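A false-discovery rate adjustment of the kind mentioned above can be illustrated with the Benjamini-Hochberg procedure. The text does not state which adjustment method was applied in SPSS, so the choice of Benjamini-Hochberg is an assumption, and the P-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values from several acoustic comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A comparison would then be treated as significant when its adjusted value falls below 0.05.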
To confirm that males produce both 22 kHz and 50 kHz calls serially in the same context, we exposed males to female rats in a simulation of the pre-mating context and counted how many 22 kHz calls and 50 kHz calls were produced. Overall calling activity during the 5 min premating context (330±43 calls in 5 min; N=12 rats, mean±s.d.) was higher than during the habituation (185±36 calls in 5 min; N=12 rats) (data for experienced and naïve males pooled; t=−4.3; P=0.001). Calling activity during the premating context was not different between the experienced and naïve rats (390±71 and 270±42 calls in 5 min in experienced and naïve rats, respectively; N1,2=6, t=1.44, P=0.18; Table 1). All sexually naïve males produced 22 kHz calls and 50 kHz calls during exposure to females. Five out of 6 naïve rats also produced a call type described here for the first time. The call was composed of a long 22 kHz call component and a 50 kHz trill call component. Sometimes the trill component preceded the 22 kHz component, but in most cases, the trill component followed the 22 kHz component (χ2=33.5, d.f.=4, P<0.05) (Fig. 2). Composite calls accounted for between 0.4 and 12.6% of the calls produced in 5 min.
Next, we investigated whether the two components of the composite 22 kHz call–50 kHz trill (hereafter ‘22 kHz-trill’) spectrographically resemble 22 kHz calls and 50 kHz trill calls. Call duration, center and mean fundamental frequency were compared between 22 kHz calls produced alone and in the 22 kHz-trill. The comparison revealed small but consistent differences in call duration (mixed effect linear regression, z=10.2, P<0.001), center (z=5.7, P<0.001) and mean fundamental frequency (z=6.0, P<0.001) (Table 2). The trill component in the composite calls resembled 50 kHz trills in trill rate (z=1.5, P=0.13) and trill duration (z=0.58, P=0.56), but mean fundamental frequency was lower (z=18.0, P<0.001) (Table 3).
Next, we investigated whether call types that appear spectrographically concatenated were produced during a single breath or two subsequent breaths. Close visual inspection of spectrograms of the 22 kHz-trill showed that there can be a silent period or an uninterrupted transition between the 22 kHz and the trill segments. Two calls can be combined using two alternative breathing patterns. Each call could be produced during a distinct exhalation, perhaps separated only by a rapid inhalation, a so-called minibreath. A minibreath can still produce the appearance of a composite call on a spectrogram (Hartley and Suthers, 1989). Alternatively, the two calls are produced during a single, extended breath and would therefore motorically constitute a different third utterance. The following experiments were performed in the 6 naïve rats between P70 and P85, i.e. 16 to 31 days after the initial experiment on P54. We recorded subglottal pressure and sound in 6 males. Fig. 3 illustrates a bout of calls identified as 22 kHz, 50 kHz and 22 kHz-trills. The 22 kHz-trills were uttered as part of such bouts. The 6 animals produced a total of 96 22 kHz-trill calls; all were produced during a single breath.
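The distinction between a single breath and two exhalations separated by a minibreath can be expressed as a check on the subglottal pressure trace. The sketch below assumes that an inhalation appears as subglottal pressure falling to or below ambient (0 kPa); this criterion, the function name and the example traces are hypothetical and are not the scoring procedure used in the study.

```python
def produced_in_single_breath(pressure_kpa, start_idx, end_idx,
                              inhale_threshold_kpa=0.0):
    """Return True if pressure stays above the inhalation threshold from
    the onset of the first segment to the offset of the second one."""
    if not 0 <= start_idx <= end_idx < len(pressure_kpa):
        raise ValueError("segment indices are out of range")
    window = pressure_kpa[start_idx:end_idx + 1]
    return all(p > inhale_threshold_kpa for p in window)


# Hypothetical pressure samples (kPa); negative values mark inhalations.
single_breath = [-0.2, 0.9, 1.0, 1.1, 1.2, 1.3, -0.3]
with_minibreath = [-0.2, 0.9, 1.0, -0.1, 1.2, 1.3, -0.3]

print(produced_in_single_breath(single_breath, 1, 5))
print(produced_in_single_breath(with_minibreath, 1, 5))
```

A spectrogram alone could look similar in both cases, which is why the pressure signal is needed to separate them.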
We also used these data to test how both breathing and laryngeal movements for the two call components in the composite call compared with movements during 22 kHz calls and 50 kHz trills produced as separate calls. Subglottal pressure at mid-call ranged between 0.8 and 1.2 kPa in the 22 kHz component and between 1.0 and 1.4 kPa during the trill component, falling within ranges reported for 22 kHz calls and for 50 kHz trills (Riede, 2013). In three of the six males, both subglottal pressure and laryngeal muscle activity were recorded. The composite calls were produced by two different laryngeal motor patterns (Fig. 4). EMG activity was tonic during the 22 kHz component and reached 20–40% of maximum muscle activity. EMG activity was phasic and reached high amplitudes (100% relative to maximum activity during swallowing) during the trill component. Both observations agree with previous findings that laryngeal muscle activity and subglottal pressure are tightly associated with fundamental frequency features in ultrasonic calls (Riede, 2013, 2014).
In human speech, producing more syllables during a single exhalation requires an adjustment of the inhaled air volume (Whalen and Kinsella-Shaw, 1997). Research in rats has previously shown that the concatenation of different 50 kHz calls is sometimes associated with preceding augmented breaths (or ‘sighs’) (Riede, 2014). Here, we investigated whether 22 kHz-trills were associated with adjustments to breathing movements prior to their production. Visual inspection of the pressure signal confirmed that composite calls in all 6 males are sometimes preceded by deeper and longer inhalations. On average, the duration of the preceding inhalation was significantly longer in 22 kHz-trills than in 22 kHz calls without trills (Table 4), even when total call duration was considered as a covariate (ANCOVA, F1,126=7.5; P<0.001).
Finally, we address the question of what triggers the concatenation of 22 kHz and 50 kHz calls. The occurrence of composite calls could be related to overall call rate and may be a mechanism to increase call rate by reducing the time required for inhalations. As a proxy for call rate, we used the total number of calls produced during the 5 min exposure to a female rat. Neither the correlation (Spearman rank) between the total number of calls and the number of 22 kHz-trills (ρ=−0.6; P=0.2; N=6) nor that between the total number of calls and the number of 22 kHz calls (ρ=−0.35; P=0.5; N=6) was significant. Interestingly, however, large numbers of composite calls were produced by two animals that showed a low overall calling rate (Fig. 5). The correlation between the total number of calls and the number of 50 kHz calls was significant (ρ=0.94; P<0.01; N=6) (Fig. 5).
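For readers unfamiliar with the statistic, the Spearman rank correlation is the Pearson correlation computed on ranks. The sketch below illustrates the calculation with hypothetical call counts; it is not the SPSS routine used for the analysis, it does not compute a P-value, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts for six animals (not the data from this study).
total_calls = [120, 180, 240, 300, 360, 420]
composite_calls = [30, 25, 12, 9, 4, 1]
print(spearman_rho(total_calls, composite_calls))
```

With only six animals, even a fairly large coefficient can fail to reach significance, which is consistent with the non-significant values reported above for 22 kHz-trills and 22 kHz calls.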
Results presented here inform our understanding of nonhuman vocal production by illustrating the underlying mechanism of producing a complex call type in a nonhuman mammal. We have previously shown the recombinatorial abilities of rats to produce different call types in the 50 kHz category (Riede, 2014). Male rats can also concatenate two call types from different categories (22 kHz and 50 kHz calls), with opposed affective significance (aversive versus appetitive), into a single new utterance. The two call types are produced by two qualitatively different motor actions of laryngeal muscles (Riede, 2013). Here, we show that the two components are sometimes concatenated into a single utterance produced during a single breath. Breathing movements preceding the 22 kHz-trill are often deeper and longer. The altered breath preceding the 22 kHz-trill may provide adjustment for a greater air volume required to produce a concatenated utterance, a phenomenon well-known in human speech (e.g. Hoit et al., 1989; Winkworth et al., 1995; Whalen and Kinsella-Shaw, 1997). Overall breathing movements during the production of 22 kHz and 50 kHz calls are very specific in rats (Sirotin et al., 2014), which might be a reflection of the affective state (Frysztak and Neafsey, 1991). Affective state also profoundly influences human speech production (Murray and Arnott, 1993; Bachorowski and Owren, 1995). Rat USVs have distinct ethological and neurophysiological correlates (Brudzynski, 2009). For that reason, the question of the functional relevance of producing calls with different affective significance in the same context and concatenating them remains open. However, the observation that higher rates of 22 kHz calls and composite calls are associated with lower overall call rate might reflect an ambivalent state in the sender.
The ability of mammals to concatenate different calls within a single breath is poorly understood. Some species can produce repetitions of one call type during the same exhalation. Bats (Rhinolophus ferrumequinum) generate increasing numbers of echolocation calls per breath (the so-called feeding buzz) as they approach prey (Rübsamen and Betz, 1986). Squirrel monkeys (Saimiri sciureus) produce bouts of short stereotypic peeps during a continuous expiratory movement (Häusler, 2000). The only well-studied example is human speech. Utterance length in human speech refers to the number of syllables or words produced in one breath. The average number of syllables in conversational speech increases with age in response to body size and because the maturing human learns to better coordinate speech breathing (Huber and Stathopoulos, 2015). Our finding contradicts the suggestion that only humans are capable of concatenating different vocal types during single breaths (MacLarnon and Hewitt, 2004). Results in rats now suggest that breathing can be adjusted to concatenate different 50 kHz calls into single breaths (Riede, 2014) as well as 22 kHz calls and 50 kHz calls together.
The combined 22 kHz call and 50 kHz trill composite and the removal of its motor gesture from a rat's vocal repertoire as the animal matures have not been reported previously despite many descriptions of the vocal repertoire in rats. A few studies have mentioned that both call types are sometimes produced simultaneously (Geyer and Barfield, 1978; Barfield et al., 1979; Vivian and Miczek, 1991; Barker et al., 2014; Burgdorf et al., 2000); however, from early studies to today (e.g. Sales and Pye, 1974; Brudzynski, 2009; Wright et al., 2010), 22 and 50 kHz calls had been described as discrete call categories. The distinct usage of 22 kHz and 50 kHz calls in adult animals could occur subsequent to experience. We observed that sexually naïve males concatenated 22 kHz calls and 50 kHz calls, but none of the sexually experienced siblings did. Similar to alarm calls in other mammals (Mateo and Holmes, 1997), 22 kHz calls in rats are not innately recognized as a distress signal (e.g. Wöhr and Schwarting, 2010; Endres et al., 2007; Bang et al., 2008) and 50 kHz vocal patterns are also influenced by experience (Wöhr et al., 2008).
The combination of both call types into a single utterance had been overlooked, probably because the spectrographic image of a composite call could also be interpreted as two separate calls from two animals calling in close succession, highlighting the importance of investigating vocal motor control. The rigorous assignation of calls to specific animals in a group of calling individuals remains a challenge (Janik et al., 2000; Gill et al., 2015; Neunuebel et al., 2015), but a comprehensive recording of subglottal pressure and/or EMG activity together with sound allows an unambiguous interpretation.
Based on spectrographic analysis, previous research has suggested that many animals concatenate simple calls into more complex calls (e.g. Arnold and Zuberbühler, 2008, 2012; Ghazanfar et al., 2001; Jansen et al., 2012; Bohn et al., 2013). Unfortunately, spectrographic analysis is not conclusive on whether two different spectrographic patterns are based on different laryngeal motor patterns or whether they are produced during a single or two subsequent breaths. The nonlinear characteristics of the laryngeal sound source itself (Herzel et al., 1994; Kobayasi et al., 2012) and its nonlinear dependence on subglottal pressure (Titze, 1989; Riede, 2013) can facilitate complex vocal sounds, and minibreaths could separate two call types and still produce the spectrographic appearance of a single utterance (Hartley and Suthers, 1989; Franz and Goller, 2002). Quantification of vocal movements eliminates this ambiguity.
Rats generate a large vocal repertoire by combining and re-combining a small number of simple call types (Riede, 2014; this study). We found small differences between 22 kHz and 50 kHz calls produced alone or in combination. If spectral and temporal features of the composite calls are distinct from the respective calls produced during separate breaths, this could be due to changes in glottal airflow and rapid reconfiguration of the laryngeal valve required to produce the composite calls. The quick transition between two patterns in the composite call may cause an acoustic structure that is distinct from that required to produce each component in isolation. Important to note is that the values for call duration, fundamental frequency and trill rate in 22 kHz, 50 kHz and 22 kHz-trill (Tables 2 and 3) fell within normal ranges reported for 22 kHz calls (Brudzynski et al., 1993) and for 50 kHz calls (Wright et al., 2010), respectively. A possibly related phenomenon referred to as co-articulation is known in human speech production (e.g. Ostry et al., 1996), where a clearly detectable acoustic change occurs to speech sound A when it is concatenated with speech sound B. The connection of individual speech sound movements into one smooth whole causes adjustments reflected in the phenotypic readout.
Results of this study provide some information about how motor control of vocal repertoires may have evolved. Most vocalizations in mammals are facilitated by complex movements of the vocal organ and the respiratory system (e.g. Smotherman et al., 2006; Jürgens, 2009). The nervous system can more effectively control complexity by a hierarchical organization of movements (Giszter, 2015). Combining a number of simple calls into a repertoire consisting of simple and composite call types can be an efficient way of communication, in contrast to a system in which each signal has a distinct form (e.g. Scott-Phillips and Blythe, 2013; Nowak et al., 2000; Yip, 2006). The use of vocal motor primitives as building blocks for a complex vocal repertoire could be the basis of complex vocal patterns in other mammals as well.
Rodent vocal behavior is very diverse and covers a large spectral range. Many questions from morphology to function and neuromuscular control are still unanswered. The mechanisms of vocal production of rodent vocal behavior continue to present many interesting opportunities to study comparative and evolutionary questions of vocal communication.
Conceptualization: C.H. and T.R.; Methodology: C.H., M.S., T.R.; Investigation: C.H., M.S., T.R.; Writing, review and editing: T.R., M.S., C.H.
C.H. was supported by a student research fellowship from the College of Veterinary Medicine at Midwestern University.
The authors declare no competing or financial interests.