Rodent diversification is associated with a large diversity of species-specific social vocalizations generated by two distinct laryngeal sound production mechanisms: whistling and airflow-induced vocal fold vibration. Understanding the relative importance of each modality to context-dependent acoustic interactions requires comparative analyses among closely related species. In this study, we used light gas experiments, acoustic analyses and laryngeal morphometrics to identify the distribution of the two mechanisms among six species of deer mice (Peromyscus spp.). We found that high frequency vocalizations (simple and complex sweeps) produced in close-distance contexts were generated by a whistle mechanism. In contrast, lower frequency sustained vocalizations (SVs) used in longer distance communication were produced by airflow-induced vocal fold vibrations. Pup isolation calls, which resemble adult SVs, were also produced by airflow-induced vocal fold vibrations. Nonlinear phenomena (NLP) were common in adult SVs and pup isolation calls, suggesting irregular vocal fold vibration characteristics. Both vocal production mechanisms were facilitated by a characteristic laryngeal morphology, including a two-layered vocal fold lamina propria, small vocal membrane-like extensions on the free edge of the vocal fold, and a singular ventral laryngeal air pocket known as the ventral pouch. The size and composition of vocal folds (rather than total laryngeal size) appear to contribute to species-specific acoustic properties. Our findings suggest that dual modes of sound production are more widespread among rodents than previously appreciated. Additionally, the common occurrence of NLP highlights the nonlinearity of the vocal apparatus, whereby small changes in anatomy or physiology trigger large changes in behavior. Finally, consistency in mechanisms of sound production used by neonates and adults underscores the importance of considering vocal ontogeny in the diversification of species-specific acoustic signals.
Rodents produce diverse acoustic signals over a wide spectral range using at least two different mechanisms (e.g. Fernández-Vargas et al., 2022) (Fig. 1). Vocalizations are produced by airflow-induced vocal fold vibrations at the lower end of the spectral range and a laryngeal whistle is produced at the upper spectral range (Roberts, 1975; Riede, 2011; 2013; Pasch et al., 2017). The former mechanism is used by almost all mammals documented to date (e.g. Madsen et al., 2012; Koda et al., 2015), wherein glottal airflow draws vocal fold tissue into vibration to generate pressure fluctuations perceived as sound. In contrast, the laryngeal whistle is a unique innovation in rodents (Fernández-Vargas et al., 2022). High-frequency (or ‘ultrasonic’) whistles are produced by a glottal airstream that interacts with a rigid intralaryngeal structure, generating pressure fluctuations that resonate inside the laryngeal airway (Riede et al., 2017; Håkansson et al., 2022). One hypothesis for the origin of this innovation is to escape detection of acoustically orienting predators (Blanchard and Blanchard, 1989; Brudzynski, 2014). Regardless of the process, the distribution of whistling among rodents remains largely unknown. A better understanding of the importance of species-specific acoustic variation requires characterization of the underlying sources of such variation.
The monophyletic rodent genus Peromyscus (Cricetidae, Neotominae) provides a model system to study the evolution of adaptive divergence (Bedford and Hoekstra, 2015), including vocal communication (Kalcounis-Rueppell et al., 2018a). ‘Deer mice’ have evolved diverse vocal repertoires used in a variety of social contexts (Eisenberg, 1961; Miller and Engstrom, 2010, 2012; Briggs and Kalcounis-Rueppell, 2011; Kalcounis-Rueppell et al., 2006, 2010, 2018b; Pultorak et al., 2015). Three call types found in adult Peromyscus include sustained vocalizations (hereafter SVs), simple and complex sweeps (hereafter ‘sweeps’) and barks (sometimes referred to as screams). SVs consist of one or more syllables of short duration (∼200 ms) uttered in close succession, and their fundamental frequency ranges between 10 and 25 kHz. Sweeps are vocalizations of short duration (10–50 ms) with fundamental frequencies above 25 kHz (Kalcounis-Rueppell et al., 2018b). Barks are 50–100 ms vocalizations with a fundamental frequency range between 0.8 and 6 kHz. A fourth call type comprises pup isolation vocalizations produced by offspring during the first 3 weeks of life. Newborn deer mice produce such characteristic vocalizations when isolated from their mother (Hart and King, 1966; Smith, 1972; Johnson et al., 2017; Kalcounis-Rueppell et al., 2018c). Pup isolation calls resemble adult SVs in spectral and temporal features (Johnson et al., 2017; Kalcounis-Rueppell et al., 2018c).
Despite the extensive repertoire and radiation of Peromyscus (Kalcounis-Rueppell et al., 2018a), no studies have characterized mechanisms of vocal production. However, the presence of nonlinear phenomena (NLP) in the SVs of P. californicus (Miller and Engstrom, 2012) indicates that such vocalizations are produced by airflow-induced vocal fold vibrations (Herzel et al., 1994, 1995). NLP result from irregular vibration patterns of the vocal folds and are commonly found in human and nonhuman mammals (e.g. Wilden et al., 1998; Riede et al., 1997, 2000; Blumstein and Récapet, 2009; Titze et al., 2008). NLP may be indicative of arousal (Blumstein and Récapet, 2009), predictability (Townsend and Manser, 2011) and/or provide signatures of individual or species identity by indicating the maximum fundamental frequency at which vocal folds can perform symmetric harmonic vibrations (Riede et al., 2007).
Acoustic properties are determined by vocal organ size, vocal fold composition, airway geometry, and the coordination of vocal organ and breathing movements (Fernández-Vargas et al., 2022). Understanding the relative contributions of these morphological and physiological traits to acoustic variation could provide insight into the evolution of species-specific vocalizations. In this study, we used light gas experiments and acoustic analyses to identify the distribution of the two mechanisms among six species of deer mice. In addition, we qualitatively described the anatomy of vocal organs to inform laryngeal biomechanics. Finally, we compared vocal organ and laryngeal airway size as well as vocal fold size and composition among the six species to better understand determinants of species-specific acoustic properties.
MATERIALS AND METHODS
A total of 74 individuals from six Peromyscus species were included in this study. Individuals of five species were acquired from the Peromyscus Genetic Stock Center (PGSC) and one species was wild-captured [Peromyscus truei (Shufeldt 1885)]. Live Peromyscus californicus (Gambel 1848) and Peromyscus maniculatus (Wagner 1845) were purchased and bred at Midwestern University, Glendale, AZ. Twelve adult animals (6/sex) from each of the two species were investigated through sound recordings, heliox experiments and anatomical analysis. Additionally, six pups (P. californicus) were investigated through sound recordings (N=6) and heliox experiments (N=4). Twelve specimens (6/sex) from each of Peromyscus polionotus (Wagner 1843), Peromyscus eremicus (Baird 1858) and Peromyscus leucopus (Rafinesque 1818) were purchased from the PGSC for anatomical analyses.
Twenty P. truei were captured near Deadman Flat, 28 km north of Flagstaff, AZ, using Sherman live-traps baited with sterilized bird seed and transferred in standard mouse cages to animal facilities at Northern Arizona University, Flagstaff, AZ, for sound recordings. Twelve animals (6/sex) were transferred to Midwestern University, Glendale, AZ, USA, for heliox experiments and morphological analysis.
All procedures were performed in accordance with ethical standards and approval of the Institutional Animal Care and Use Committee at Midwestern University (MWU#3011) and Northern Arizona University (19-006) and guidelines of the American Society of Mammalogists (Sikes and Animal Care and Use Committee of the American Society of Mammalogists, 2016). Animals were captured with a permit from the Arizona Game and Fish Department (607608).
Recording vocal behavior in light gas atmosphere, examination of nonlinear phenomena and anatomical investigation of laryngeal tissue can be used to inform the sound production mechanism used to produce four types of vocalizations. For light gas experiments, a vocalizing animal was placed in a closed container with a gas mixture that has a lower density than normal air. The approach can differentiate between the two vocal production mechanisms. The vibration frequency of vocal folds is independent of the type of gas that surrounds them (Titze et al., 2016), i.e. the fundamental frequency of the sound does not change in light gas. However, the velocity of the sound wave is faster in the light gas and the fundamental frequency of a whistle sound therefore increases predictably (Roberts, 1975; Riede, 2011; Pasch et al., 2017; Riede and Pasch, 2020).
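The size of the shift expected for a whistle can be illustrated with a short calculation. The following is a minimal sketch, assuming ideal-gas behaviour, textbook molar masses and heat capacities, a simplified two-component dry air, and a whistle frequency that scales linearly with sound speed; none of these constants or function names come from the study.

```python
import math

# Assumed textbook values (not from the study): molar mass in g/mol and
# molar heat capacities expressed in units of the gas constant R.
GASES = {
    "He": {"M": 4.0026, "cp": 2.5, "cv": 1.5},   # monatomic
    "O2": {"M": 31.998, "cp": 3.5, "cv": 2.5},   # diatomic
    "N2": {"M": 28.014, "cp": 3.5, "cv": 2.5},   # diatomic
}

def gamma_over_molar_mass(fractions):
    """Return gamma / M for an ideal-gas mixture given mole fractions."""
    molar_mass = sum(x * GASES[g]["M"] for g, x in fractions.items())
    cp = sum(x * GASES[g]["cp"] for g, x in fractions.items())
    cv = sum(x * GASES[g]["cv"] for g, x in fractions.items())
    return (cp / cv) / molar_mass

def sound_speed_ratio(mix, reference):
    """Ratio of sound speeds at equal temperature, c = sqrt(gamma * R * T / M)."""
    return math.sqrt(gamma_over_molar_mass(mix) / gamma_over_molar_mass(reference))

air = {"N2": 0.79, "O2": 0.21}      # simplified dry air (assumption)
heliox = {"He": 0.80, "O2": 0.20}   # nominal heliox mixture

ratio = sound_speed_ratio(heliox, air)
print(f"Predicted whistle f0 ratio in undiluted heliox: {ratio:.2f}")
```

Under these assumptions the ratio is an upper bound, because heliox in the cage is diluted by residual air; a vocal fold sound would be expected to show a ratio close to 1 regardless of gas composition.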
Acoustic recordings in a light gas atmosphere were successfully conducted in P. californicus (pup isolation calls, adult SVs and sweeps) and P. maniculatus (adult barks and sweeps). Individual mice were placed in an acrylic cage. The cage was equipped with bedding, food and water. Heliox gas (80% He, 20% O2) was injected into the cage at flow rates between 20 and 40 l min−1 through a 12 mm wide tube placed into the cage wall near the floor. Predicted acoustic effects of light gas concentrations were estimated with a small whistle placed at the floor of the cage and connected externally by a silastic tube. The whistle was blown and recorded at regular intervals to monitor the heliox concentration. The ratio of the frequency of the whistle in air and in heliox allowed an estimation of the expected effect for any given heliox concentration.
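The use of the calibration whistle can be sketched as follows. This is an illustrative example only: the function name and the numerical values are hypothetical, and it assumes that a whistle-produced call shifts by the same ratio as the calibration whistle while a vocal fold sound does not shift.

```python
def expected_call_frequency(f0_air_khz, whistle_air_khz, whistle_heliox_khz):
    """Predict a call's f0 in the current gas mixture under each mechanism."""
    # Ratio measured from the calibration whistle at the current heliox level.
    shift = whistle_heliox_khz / whistle_air_khz
    return {
        "whistle": f0_air_khz * shift,   # scales with sound speed
        "vocal_fold": f0_air_khz,        # independent of the surrounding gas
    }

# Hypothetical numbers for illustration only.
pred = expected_call_frequency(f0_air_khz=40.0,
                               whistle_air_khz=4.0,
                               whistle_heliox_khz=6.0)
print(pred)
```

Comparing the observed f0 in heliox with the two predictions indicates which mechanism is more consistent with the recording.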
Heliox experiments indicated that SVs in P. californicus are generated by flow-induced vocal fold vibrations (see Results). Since SV calls among Peromyscus species show similar spectro-temporal features (Kalcounis-Rueppell et al., 2018b), we inferred that SV calls in other Peromyscus species were generated by the same mechanism and thus focused on the occurrence of harmonic patterns and nonlinear phenomena (NLP) that typify sounds produced by vocal fold vibration. In order to determine the occurrence of NLP in SV calls, we intensively sampled vocalizations of two species (P. californicus and P. truei). P. californicus vocalizations were recorded using an ultrasonic microphone (Avisoft-Bioacoustics, CM16/CMPA-5V) placed over the center of the cage. The microphone has a frequency range of 2 to 200 kHz and an approximate sensitivity of 500 mV Pa−1. Signals were acquired through an NiDAQ 6212 acquisition device, sampled at 200 kHz, and saved as uncompressed files using Avisoft Recorder software (version 3.4.2, Avisoft-Bioacoustics, Berlin, Germany). For P. truei, singly housed mice in their home cage were placed in semi-anechoic coolers lined with acoustic foam. We used ¼′′ (6.4 mm) microphones (Type 40BE, G.R.A.S.) connected to preamplifiers (Type 26 CB, G.R.A.S.) to obtain recordings above the center of the mouse cage. Microphone response was flat within ±1.5 dB from 10 Hz to 50 kHz, and pre-amplifier response was flat within ±0.2 dB from 2 Hz to 200 kHz. Microphones were connected to a National Instruments Data Acquisition unit (USB 4431) sampling at 102.4 kHz to a desktop computer running a custom recording program in MATLAB (v. 2018a).
Micro-CT scanning and histology
Twelve adult mice per species (6/sex) were euthanized with ketamine and xylazine, and then transcardially perfused with saline solution followed by 10% buffered formalin. Larynges were dissected and placed in 10% buffered formalin phosphate (SF100-4; Fisher Scientific) for 2 days.
Larynges from eight mice (4/sex) were x-rayed at 5 µm resolution. First, tissues were transferred from the formalin solution to 99% ethanol. Tissues were then stained in 1% phosphotungstic acid (PTA) (Sigma Aldrich, 79690) in 70% ethanol. After 5 days, the staining solution was renewed and the tissue was stained for an additional 5 days. After staining, specimens were placed in a custom-made acrylic tube and scanned in air. Micro-CT scanning was done using a Skyscan 1172 (Bruker). Reconstructed image stacks were then imported into AVIZO software (v. Lite 9.0.1). Laryngeal cartilages and the border between the airway and soft tissues of the larynx were traced manually in CT scans. This approach provided outlines of the cartilaginous framework and the airway. Derived 3D surfaces of eight specimens from each of the six species have been archived at Morphobank (O'Leary and Kaufman, 2012), project # P4106.
Coronal histological sections of larynges from four mice (2/sex) were used to quantify vocal fold morphology (lamina propria thickness and fibrillar protein distribution). Mid-membranous coronal sections (5 µm thick) were stained with Haematoxylin and Eosin for a general overview, Masson's Trichrome (TRI) for collagen fiber stain and Elastica-Van Gieson (EVG) for elastic fiber stain. Sections were scanned with an Aperio CS 2 slide scanner and processed with Imagescope software (v. 220.127.116.113; Aperio Tech.).
Four call types were successfully recorded in heliox and normal air: sweeps, SV calls, barks and pup isolation calls. All four call types were analyzed for center, minimum (f0,min) and maximum (f0,max) fundamental frequency and for call/syllable duration. Fundamental frequency was quantified every 20 ms using PRAAT's pitch-tracking tool. Then, frequency values were represented as histograms (100 or 500 Hz resolution). Center fundamental frequency was calculated from the weighted median of all frequency measurements. Fundamental frequency range was calculated from the difference between f0,min and f0,max. Acoustic differences between calls produced in normal air and in heliox were assessed with paired t-tests.
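The two summary computations described above, a weighted median over a frequency histogram and a paired comparison between air and heliox, can be sketched as follows. This is a minimal illustration with hypothetical data and function names, not the analysis code used in the study; the sign convention (air minus heliox) is an assumption, chosen so that an increase in heliox yields a negative t.

```python
import math
import statistics

def weighted_median(values, weights):
    """Value at which the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("empty input")

def paired_t(air, heliox):
    """Paired t statistic (air minus heliox) and its degrees of freedom."""
    diffs = [a - h for a, h in zip(air, heliox)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical histogram: bin centers (kHz) and number of f0 measurements.
center_f0 = weighted_median([30, 31, 32, 33], [2, 5, 10, 3])

# Hypothetical per-animal center f0 (kHz) in air and in heliox.
t_stat, df = paired_t([30.0, 31.0, 32.0, 33.0], [35.0, 37.0, 37.0, 39.0])
print(center_f0, round(t_stat, 2), df)
```

A P value would then be obtained from the t distribution with the returned degrees of freedom, for example with a statistics package.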
Vocalizations were analyzed using the pitch tracking tool (1024-point Fast Fourier Transform (FFT), 75% frame size, Hann window, frequency resolution 100 Hz, temporal resolution 93.75%, 0.625 ms) in the software PRAAT (v. 5.3.80, retrieved January 2014 from http://www.praat.org/). Call duration, maximum fundamental frequency, and minimum fundamental frequency were manually extracted. NLP were first categorized into frequency jumps (FJs), subharmonics (SHs), deterministic chaos (CH) or biphonation (BP) (Fig. 2). NLP were quantitatively analyzed in P. californicus (n=4/sex) and P. truei (n=6 females, 3 males) based on visual inspection of a narrowband spectrogram of the signal (Herzel, 1993) and associated Fourier frequency spectra, following earlier studies (Riede et al., 2000, 2004; Titze et al., 2008; Zollinger et al., 2008). First, different temporal segments of an SV call were determined. Segment borders were positioned at bifurcations. A bifurcation refers to the boundary between different regimes, such as ‘no phonation’, harmonic phonation, SHs, BP, CH and FJs (e.g. Riede et al., 2000, 2004). Occurrence of each NLP type relative to the number of SV bouts and syllables was measured. Duration of syllables and NLP segments, as well as percentage occurrence, were calculated.
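Once each call has been segmented at bifurcations, the occurrence of NLP types can be tallied in a straightforward way. The sketch below assumes a hypothetical data layout in which every call is a list of (regime, duration) segments labeled by visual inspection; the labels, names and example values are illustrative and not taken from the study.

```python
from collections import Counter

NLP_TYPES = {"FJ", "SH", "CH", "BP"}  # frequency jump, subharmonics, chaos, biphonation

def summarize_nlp(calls):
    """calls: list of calls, each a list of (regime, duration_ms) segments."""
    per_type = Counter()
    calls_with_nlp = 0
    for segments in calls:
        # NLP types present in this call, each counted once per call.
        present = {regime for regime, _ in segments if regime in NLP_TYPES}
        per_type.update(present)
        calls_with_nlp += bool(present)
    n = len(calls)
    return {
        "pct_calls_with_nlp": 100.0 * calls_with_nlp / n,
        "pct_by_type": {t: 100.0 * per_type[t] / n for t in sorted(NLP_TYPES)},
    }

# Hypothetical segmented calls (durations in ms).
example_calls = [
    [("harmonic", 120), ("SH", 60)],
    [("harmonic", 200)],
    [("harmonic", 80), ("FJ", 0), ("harmonic", 70)],
    [("CH", 50), ("SH", 40)],
]
summary = summarize_nlp(example_calls)
print(summary)
```

Because a call can contain several NLP types, the per-type percentages need not sum to the percentage of calls containing any NLP.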
Laryngeal morphology and vocal fold histology
Laryngeal anatomy was investigated by focusing on three aspects of the vocal organ. First, to test whether overall organ size served as a proxy for laryngeal valve function, we quantified thyroid cartilage (whole organ) centroid size and vocal fold length (laryngeal valve). Second, previous work suggested that a small pocket (i.e. the ventral pouch) plays an important role in ultrasonic whistle production (Riede et al., 2017). Therefore, laryngeal airway shape was described qualitatively. Third, shape and composition of vocal folds determine their biomechanical properties and vibration characteristics (Titze et al., 2016). In a related genus (Onychomys spp.), a heterogeneous lamina propria and presence of vocal membranes support the production of long-distance low frequency calls (Pasch et al., 2017). Therefore, we studied lamina propria heterogeneity and presence of vocal membranes located near the free edge of the vocal folds.
Size was described by centroid size and vocal fold length in 48 specimens (8 per species and 4 per sex; P. californicus, P. maniculatus, P. leucopus, P. eremicus, P. polionotus, P. truei). Geometric morphometric methods were developed previously and are outlined in detail in Borgard et al. (2020) and Riede et al. (2020). Briefly, the Geomorph package (v. 3.0.5.; https://CRAN.R-project.org/package=geomorph) for R (https://www.r-project.org/) was used to measure thyroid cartilage centroid size as a proxy of overall larynx size. Landmarks (24 curve landmarks and 100 semi-landmarks) were placed on 3D surface renderings of the thyroid cartilage. Centroid size was calculated as the square root of the sum of squared distances of each landmark from the center of the cartilage (Zelditch et al., 2004). Vocal fold length was measured between the most ventral tip of the vocal process of the arytenoid cartilage and the midline thyroid cartilage near its caudal edge. Body size was estimated through body mass and left femur length. Body mass and femur length were found to be strongly positively correlated (Pearson correlation, N=72; r=0.81; P<0.001). Body mass (F2,69=9.3; P<0.001) but not femur length (F2,69=2.1; P=0.13) was different between males and females. Therefore, males and females were combined for analyses and femur length was used as body size estimate.
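The centroid size definition given above can be written out directly. The study computed it with the Geomorph package in R; the following Python sketch is only an illustration of the formula, with a hypothetical function name and made-up landmark coordinates.

```python
import math

def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks from their centroid."""
    n = len(landmarks)
    dims = len(landmarks[0])
    # Centroid: mean coordinate of all landmarks along each axis.
    centroid = [sum(p[d] for p in landmarks) / n for d in range(dims)]
    return math.sqrt(sum((p[d] - centroid[d]) ** 2
                         for p in landmarks for d in range(dims)))

# Hypothetical landmarks: corners of a unit square in the z = 0 plane.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
cs = centroid_size(square)
print(round(cs, 4))
```

Centroid size carries the units of the landmark coordinates and grows with the number of landmarks, so comparisons are meaningful only between specimens digitized with the same landmark scheme.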
In 24 specimens (4/species; 2/sex), lamina propria thickness was measured and averaged across 3 locations positioned equidistant under the non-ciliated epithelium. In 18 (out of 24) specimens, we found vocal membrane-like structures. The height and width of vocal membranes were measured. All measurements were taken in mid-membranous coronal sections using software ImageJ. We used multiple regression to assess whether anatomical measures could be predicted from body size and/or species identity.
Collagen and elastin content of the lamina propria was quantified by digitally isolating the lamina propria of the free edge of the vocal fold. First, we drew an imaginary line bisecting the lamina propria into superficial (medial) and deep (lateral) halves. In trichrome stains, the blue-staining collagen fiber pixels within the blue range were selected using the color threshold tool in ImageJ. The image was then converted into binary mode that converted blue pixels into black and all other pixels into white. The proportion of black pixels was counted in each of five transects placed into the superficial and the deep lamina propria. Care was taken so that transects would not overlap between deep and superficial lamina propria or reach into the epithelial tissue. Black-staining elastin fibers were similarly quantified using the brightness slider in ImageJ.
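The thresholding and transect steps can be sketched in a few lines. The study used the color threshold tool in ImageJ; the example below is a simplified stand-in in which the blue-detection rule, its threshold values, the function names and the tiny test image are all hypothetical.

```python
def is_blue(pixel, min_blue=120, margin=30):
    """Hypothetical rule: blue channel is bright and dominates red and green."""
    r, g, b = pixel
    return b >= min_blue and b - max(r, g) >= margin

def stained_fraction(image, transect):
    """Proportion of 'black' (thresholded) pixels in a rectangular transect.

    image: list of rows of (r, g, b) tuples
    transect: (row_start, row_end, col_start, col_end), end-exclusive
    """
    r0, r1, c0, c1 = transect
    pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(is_blue(p) for p in pixels) / len(pixels)

def layer_ratio(image, superficial_transects, deep_transects):
    """Mean stained fraction in superficial transects divided by that in deep ones."""
    def mean_fraction(transects):
        return sum(stained_fraction(image, t) for t in transects) / len(transects)
    return mean_fraction(superficial_transects) / mean_fraction(deep_transects)

B, P = (40, 60, 200), (220, 150, 170)  # made-up blue (collagen) and pink pixels
image = [
    [B, B, B, P],
    [B, B, P, P],
    [B, P, B, P],
    [B, B, P, P],
]
# Left two columns stand in for superficial, right two for deep lamina propria.
ratio = layer_ratio(image, [(0, 4, 0, 2)], [(0, 4, 2, 4)])
print(ratio)
```

In this toy image the superficial transect contains a higher proportion of thresholded pixels than the deep transect, so the ratio exceeds one.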
In order to determine the mechanism of sound production in Peromyscus spp., we recorded adult SVs, sweeps, barks and pup isolation calls in air and in light gas atmosphere. Fig. 2 shows spectrographic representations of sweeps and SV bouts produced in air and in heliox. In adult P. californicus, fundamental frequency of sweep calls increased in heliox (paired t-tests, f0,center: t7=−4.86, P<0.001; f0,min: t7=−5.75, P<0.01; f0,max: t=−6.6, P<0.001) compared with normal air (Table 1). However, sweep call duration did not change in heliox (t7=0.21, P=0.84) compared with normal air (Table 1; Fig. 2D). In contrast, fundamental frequency of SV syllables did not change in heliox (paired t-tests, f0,mean: t3=−0.34, P=0.75) compared with normal air (Table 2). Similarly, SV duration also did not change in heliox (t3=0.33, P=0.76) compared with normal air (Table 2).
Similarly to adult SVs, fundamental frequency of pup isolation calls did not change in heliox (paired t-tests, f0,mean: t3=2.67, P=0.076) compared with normal air (Table 3), nor did their duration (t3=1.70, P=0.19; Table 3).
In P. maniculatus, we similarly found that fundamental frequency of sweep calls increased in heliox (paired t-tests, f0,center: t3=−7.01, P<0.01; f0,min: t3=−4.36, P<0.05; f0,max: t3=−4.87, P<0.01) (Table 2; Fig. 2E), but their duration did not (t3=2.02, P=0.11; Table 2). Fundamental frequency of barks did not change in heliox (paired t-tests, f0,mean: t2=1.51, P=0.27) compared with normal air (Table 2). Barks were shorter in heliox than in normal air (t2=9.8, P<0.05) (Table 2).
In sum, the data suggest that both adult P. californicus and P. maniculatus produce sweeps by a whistle mechanism. Barks, SVs and pup isolation calls, however, are produced by airflow-induced vocal fold vibration.
Fig. 3 shows examples of four types of NLP in SV bouts produced by P. californicus and P. truei. In P. californicus, the percentage of SVs containing at least one type of NLP within individuals ranged between 1.3 and 48.3% (Table 3). Subharmonics were present in the SVs of 6 out of 7 mice, and the number of calls containing subharmonic segments ranged widely among individuals (0 to 44.8%) and appeared to be individual specific (Table 3). Cumulatively across all P. californicus, out of 785 SVs that exhibited NLPs, 34 SVs (4.3%) displayed one or more frequency jumps, 95 SVs (12.1%) contained one or more subharmonic segments, 3 SVs (0.4%) had one or more chaotic segments and 14 SVs (1.8%) exhibited biphonation. We found 8 calls (1.0%) that exhibited different combinations of two types of NLP. NLP duration ranged between 26 and 48% of call duration (Table 3), indicating substantial variation in calls among mice. We also screened 20 calls from each of six 2-day-old pups and found that a proportion of 0 to 90% of calls contained NLP.
In P. truei, females vocalized extensively during social isolation, while males produced very few vocalizations (Table 4). The percentage of SVs containing at least one type of NLP within individuals ranged between 0.7 and 48% (Table 4). Similarly to P. californicus, P. truei produced more subharmonics than other NLP types, with a wide range of within-individual variability (0 to 40.7%, Table 4). Cumulatively across all P. truei, out of 3052 SV syllables that contained NLP, 242 SVs (7.9%) had frequency jumps, 1410 SVs (46.2%) contained one or more subharmonic segments, 279 SVs (9.1%) had one or more chaotic segments, and 14 SVs (0.5%) exhibited biphonation. Interestingly, P. truei produced more calls (650 calls, 21.3%) that contained >2 NLP types relative to P. californicus. Lastly, we discovered that duration of NLP varied widely among individuals (3.9 to 97%).
Thyroid cartilage centroid size was correlated with body size among and within species (F2,45=100.5; P<0.001). Vocal fold length (measured as distance between vocal process of the arytenoid cartilage and attachment to the interior of the thyroid cartilage) ranged between 595 and 741 µm in the largest of the six species (P. californicus) and between 619 and 817 µm in the smallest of the six species (P. polionotus) (Table 5). Vocal fold length was not predicted by body size either within or among species (F2,45=0.587; P=0.56).
A ventral pouch was present in all 48 individuals investigated by micro-CT imaging (Fig. 4). The pouch is positioned medially and rostral from the vocal folds. The air pocket is separated from the main laryngeal airway by a constriction consisting of alar cartilage. The ventral pouch is surrounded by the thyroid cartilage (Fig. 4). Histological images suggest that the alar cartilage is connected to a branch of the thyroarytenoid muscle which regulates the distance between the glottis and alar edge.
Vocal folds consisted of the thyroarytenoid muscle, lamina propria and epithelium. The thickness of the lamina propria measured between 79 and 133 µm (Table 5) in the six species and was not associated with body size within or among species (F2,45=0.339; P=0.717). Collagen and elastin fibers were present but not homogeneously distributed in the lamina propria (Fig. 5). Protein density was measured within transects positioned either deep (more laterally) or superficial (in the medial lamina propria). A higher density for both proteins was found in the superficial layer. The ratios between superficial and deep lamina propria for each protein were greater than zero (Fig. 5C,D).
Vocal membranes were present in all four individuals of P. californicus, P. leucopus and P. maniculatus, but only in 2 of 4 P. eremicus, 1 of 4 P. polionotus, and 3 of 4 P. truei (Fig. 5E). In all cases, vocal membranes were positioned symmetrically on both vocal folds. Vocal membranes consisted of lamina propria and an epithelial layer that were single-lobed or consisted of two or more lobes. For example, Fig. 5B shows single lobes for P. californicus, P. eremicus, P. leucopus and P. maniculatus, multiple small lobes for P. polionotus and two lobes for P. truei.
Height and width of vocal membrane-like structures among six species ranged between 29 and 55 µm (height) and from 17 to 37 µm (width) (Fig. 5F,G). We tested whether vocal membrane size scaled with body size within or among species. Height was associated with body size among but not within species (F2,45=6.71; P<0.01). Width was not associated with body size either among or within species (F2,45=3.244; P=0.067).
Here, we investigated the biology of sound production mechanisms used in acoustic interactions among closely related species of deer mice. We found that deer mice, like other cricetid rodents, use two distinct production mechanisms: whistling to produce high frequency vocalizations in close-distance contexts, and airflow-induced vocal fold vibrations to produce SVs and isolation calls used in longer distance communication. The common occurrence of NLP in adult SVs and in pup isolation calls supports the finding that sounds are generated by flow-induced vocal fold vibrations and that such vibrations may be irregular. As in other muroid rodents, a characteristic ventral pouch likely facilitates whistle production, and small vocal membranes arising from a two-layered lamina propria presumably underlie SV production. Species differences in vocal fold size and composition may contribute to species-specific acoustic properties. We discuss our findings in relation to functional, ontogenetic and evolutionary factors that may influence the diversification of rodent acoustic signals.
Sound production mechanisms and social context
The dual sound production mechanisms found herein correspond to the functional context of vocalizations. Pup isolation calls and adult SVs both function to advertise the sender's presence to conspecifics over distances greater than a body length, either to absent parents (Rieger et al., 2019) or potential mates or rivals (Kobrina et al., 2022; Pultorak et al., 2017; Rieger and Marler, 2018). Employment of flow-induced vocal fold vibration facilitates production of sounds across a wide frequency range at high amplitudes, both of which are acoustic features that exhibit reduced environmental attenuation (Wahlberg and Larsen, 2017). In contrast, all Peromyscus spp. simple and complex sweeps were produced by a whistle mechanism. Such low amplitude, high frequency vocalizations are often produced in close-distance (<body length) social contexts where environmental attenuation is less important. Our results correspond to findings in grasshopper mice (Onychomys spp.; Pasch et al., 2017), the sister taxon to Peromyscus, indicating that such dual production mechanisms may accommodate similar social contexts in many muroid rodents.
These findings also underscore the utility of cricetid rodents (e.g. Onychomys spp. and Peromyscus spp.) in providing new avenues to explore vocal fold form and function, the relationship between physiological and environmental factors in vocal diversification, and commonalities with human speech. Unlike traditional rodent models (Mus spp. and Rattus spp.) that whistle when they vocalize (Roberts, 1975; Riede, 2011; Riede et al., 2017; Håkansson et al., 2022; Fernández-Vargas et al., 2022; Fig. 1), human studies of flow-induced vocal fold vibration highlight the integration of precise motor control, somatosensory feedback and tissue properties in speech production (e.g. Titze, 1988; Steinecke and Herzel, 1995). In particular, the mechanical demands (linear, shear and impact stress) acting on vocal fold epithelium and lamina propria during speech (Riede et al., 2011; Titze et al., 2016) indicate that strong inferences will require further study of vocal fold use and aging.
Heliox data indicated that Peromyscus spp. SV calls are produced by flow-induced vocal fold vibration. A characteristic feature of such vocalizations is the presence of NLP (e.g. Herzel et al., 1994; Tokuda, 2017). NLP were frequently present in SVs and pup isolation calls of P. californicus and in SVs of P. truei, sometimes occurring in over 50% of vocalizations produced by some individuals. Mechanistically, NLP can arise from asymmetries in vocal fold size (e.g. Tokuda et al., 2007) or nonlinear interactions between the sound source and vocal tract filter resonance (e.g. Titze et al., 2008). Additionally, vocal membrane-like structures (below) may contribute to the nonlinear dynamics of vocal production (Mergell et al., 1999). Mergell et al. (1999) described the vocal membrane as an additional reed-like plate fixed to the vocal fold. Neubauer (2004) updated Mergell's model by allowing the vocal membrane to vibrate independently from the vocal fold. In both models, the addition facilitates higher fundamental frequencies (Mergell et al., 1999; Neubauer, 2004). Such high vibration rates may also promote irregular vibrations that characterize NLP. Experimental manipulation of vocal membrane presence, size and/or symmetry would provide strong inference for their contribution to NLP.
Numerous functional hypotheses have been proposed to explain the presence of nonlinear phenomena in vocalizations. In rodent pups, NLP could serve as an honest signal of distress used to recruit older conspecifics to fend off predators (Blumstein et al., 2008). In adults, NLP may facilitate receiver arousal and fear, consequently increasing predator vigilance and decreasing habituation to alarm calls (i.e. unpredictability hypothesis; Blumstein and Récapet, 2009). NLP may also be associated with personality (Lee et al., 2021) and play a role in individual recognition and discrimination (e.g. Wilden et al., 1998; Volodina et al., 2006). Conversely, NLP may represent non-functional side-effects of vocal disorders (e.g. Tokuda et al., 2001), overuse (e.g. Vilkman, 2004) or age (e.g. Baken, 2005; Marx et al., 2021). Given increased documentation of NLP in rodent vocalizations (Blumstein et al., 2008; Miller and Engstrom, 2010, 2012), the significance of NLP awaits further experimentation.
Peromyscus spp. vocal folds possess a narrow lamina propria with a characteristic organization of fibrillar proteins, suggesting differentiation into a superficial and a deep layer. The lamina propria together with the epithelium forms vocal membranes near the free edge of the vocal fold. Flow-induced vocal fold vibration is characterized by phase differences in tissue movement between the upper (cranial) and lower (caudal) portions of the lamina propria. In Peromyscus spp. (this study) as well as in another cricetid rodent genus, Onychomys (Pasch et al., 2017), vocal membrane-like structures likely support such cranial–caudal phase difference owing to their position in the laryngeal lumen that creates a concave-shaped vocal fold surface in the coronal plane (Thomson et al., 2005). Vocal fold design is critical for multiple aspects of voice characteristics. Investigations of a nonhuman primate larynx suggested that vocal membranes facilitate sound production at higher efficiency, i.e. greater loudness for a given lung pressure (Zhang et al., 2019). Species-specific lamina propria morphology also defines a characteristic fundamental frequency range determined by compressional (lateral) and tensile (along the length) stiffness of the collagen and elastin fibers in the superficial layer (Titze et al., 2016). Indeed, our data provide preliminary support for a relationship among fundamental frequency, vocal fold size and collagen elastin consistency, since all vary among species (Table 1, Fig. 5), after controlling for body size (Fig. 4). Formal analysis of causal relationships among these variables is currently under investigation.
Interestingly, vocal membranes were not present in every adult investigated in this study (Table 5). Could such features be artifacts of faulty tissue removal, fixation, tissue processing, embedding, microtomy, staining and mounting procedures? Both historical and more recent reviews of common histological artifacts did not include structures that resemble small folds protruding from the lamina propria and epithelium if the underlying tissue is fully intact (i.e. not torn or cut) (e.g. Mehregan and Pinkus, 1966; Kumar et al., 2012; Taqi et al., 2018). Therefore, until in vivo observations of mouse vocal folds become available, we infer that histological images of vocal membranes represent the in vivo situation of the free edge of the mouse vocal fold. It is unclear whether vocal membranes are normal variations of vocal fold anatomy or a consequence of stresses and strains associated with use. Vocal fold nodules in humans, like vocal membrane-like structures in Peromyscus spp., are bilateral, symmetrical structures (e.g. Hirano et al., 1990; Ford et al., 1996; Glanz et al., 1997; Giovanni et al., 2007). Humans with nodules and other lesions may experience voice irregularities, which are nonlinear phenomena (Baken, 2005). Although vocal fold lesions remain an idiopathic disease, i.e. a disease with unknown cause, they tend to be more common among people using their voice professionally (teachers, actors, singers etc.) (e.g. Vilkman, 2004). In Peromyscus spp., future studies that characterize the ontogeny of acoustic properties coincident with developmental changes of vocal folds will elucidate the functional morphology of their vocal folds.
Social origin of vocalization is paralleled by vocal production mechanisms
Isolation calls that solicit maternal attention and care (Wöhr and Schwarting, 2008) often serve as precursors to adult vocalizations used in other social contexts (e.g. Oller et al., 2016; Pistorio et al., 2006; Matrosova et al., 2007). For example, adult Mus spp. and Rattus spp. vocal repertoires used in mating contexts likely emerge from pup vocalizations (Hofer, 2010; Brudzynski, 2014) based on spectro-temporal similarities (Wöhr and Schwarting, 2008; Hofer, 2010). Similarly, spectro-temporal similarities occur between pup isolation vocalizations and adult SVs in P. californicus (Johnson et al., 2017). Both consist of bouts of 2–4 syllables, and the fundamental frequency of pup calls ranges between 25 and 30 kHz, only slightly above the 15–24 kHz range for adult SV calls. Our findings support this observation: the duration of individual syllables is similar in pup isolation calls (50–200 ms; Johnson et al., 2017; Kalcounis-Rueppell et al., 2018c) and adult SVs (this study; Table 1).
Importantly, we found consistency in the sound production mechanism between pup calls and adult SVs in Peromyscus spp.: both were produced using flow-induced vocal fold vibration. Stability in production also occurs in Mus spp. and Rattus spp. pup and adult vocalizations, albeit using a whistle mechanism (Roberts, 1975; Riede, 2011). Such developmental stability may constrain the evolution of rodent vocalizations, both contextually and in acoustic content. For example, while Mus spp. and Rattus spp. rely almost exclusively on ultrasonic whistling for social communication, many cricetids appear to whistle only as adults (although Peromyscus spp. pups occasionally produce high frequency whistles). Similarly, many pup isolation calls show surprising spectral overlap with adult vocalizations, which is puzzling because size-dependent spectral properties would dictate a more dramatic decrease in fundamental frequency. Together, our findings highlight the need for further comparative studies that specify the ontogeny and mechanisms of vocal repertoires, including the origins of whistling. At the very least, our results challenge the predator escape hypothesis given that altricial Peromyscus spp. pups produce relatively low frequency calls audible to predators at their most vulnerable life stage.
We thank Laura Bone, Ryan Brzozowski and Mariah Letowt for assistance with animal husbandry at Northern Arizona University, and David Hendershott for assistance with trapping P. truei in the field.
Conceptualization: T.R., B.P.; Methodology: T.R.; Software: T.R.; Validation: T.R.; Formal analysis: T.R., A.K., L.B., T.D., B.P.; Investigation: T.R.; Resources: T.R.; Data curation: T.R., A.K., L.B., T.D., B.P.; Writing - original draft: T.R., B.P.; Writing - review & editing: T.R., A.K., B.P.; Visualization: T.R.; Supervision: T.R., B.P.; Project administration: T.R., B.P.; Funding acquisition: T.R., B.P.
This work was in part supported by the National Science Foundation (IOS #1754332 to T.R.; IOS #1755429 to B.P.) and the National Institutes of Health (5R01DC018280-02 subaward to T.R.). Deposited in PMC for release after 12 months.
Derived 3D surfaces of airways have been archived at Morphobank (project #P4106): https://morphobank.org/index.php/Projects/ProjectOverview/project_id/4106.
The authors declare no competing or financial interests.