Elephants' low-frequency vocalizations are produced by flow-induced self-sustaining oscillations of laryngeal tissue. To date, little is known in detail about the vibratory phenomena in the elephant larynx. Here, we provide a first descriptive report of the complex oscillatory features found in the excised larynx of a 25 year old female African elephant (Loxodonta africana), the largest animal sound generator ever studied experimentally. Sound production was documented with high-speed video, acoustic measurements, air flow and sound pressure level recordings. The anatomy of the larynx was studied with computed tomography (CT) and dissections. Elephant CT vocal anatomy data were further compared with the anatomy of an adult human male. We observed numerous unusual phenomena, not typically reported in human vocal fold vibrations. Phase delays along both the inferior–superior and anterior–posterior (A–P) dimension were commonly observed, as well as transverse travelling wave patterns along the A–P dimension, previously not documented in the literature. Acoustic energy was mainly created during the instant of glottal opening. The vestibular folds, when adducted, participated in tissue vibration, effectively increasing the generated sound pressure level by 12 dB. The complexity of the observed phenomena is partly attributed to the distinct laryngeal anatomy of the elephant larynx, which is not simply a large-scale version of its human counterpart. Travelling waves may be facilitated by low fundamental frequencies and increased vocal fold tension. A travelling wave model is proposed, to account for three types of phenomena: A–P travelling waves, ‘conventional’ standing wave patterns, and irregular vocal fold vibration.

Voice is an important means for communication in all kinds of mammals, including humans (Hauser, 1996; Bradbury and Vehrencamp, 1998). With a few exceptions, such as cat purring (Remmers and Gautier, 1972; Sissom et al., 1991), vocal production is governed by the physical principles of the myoelastic–aerodynamic theory: by muscle-supported, flow-driven, self-sustaining oscillations of laryngeal tissue (Van Den Berg, 1958; Titze, 2006). The mammalian sound generator exhibits a wide variety of oscillatory behaviour, including non-linear phenomena such as subharmonics and deterministic chaos (Fitch et al., 2002; Herbst et al., 2013). This complex system with multiple degrees of freedom has been well studied in humans and a few mammalian species with comparable laryngeal dimensions, such as dogs and sheep (Herzel, 1995; Svec et al., 2000; Tokuda et al., 2008; Döllinger et al., 2011).

African elephants (Loxodonta africana) are the largest terrestrial mammals. Their vocal communication is characterized by a rich repertoire of distinct sounds, spanning a fundamental frequency range from ten to several hundred hertz (Payne et al., 1986; Poole et al., 1988; Langbauer, 2000; Leong et al., 2003; McComb et al., 2003; Garstang, 2004; Soltis et al., 2005; Stoeger-Horwath et al., 2007). The elephant larynx is the largest mammalian sound generation system so far investigated. Recently, we conducted an excised larynx experiment and showed that the infrasonic vocalizations of elephants are produced by muscle-supported, flow-driven, self-sustaining oscillations, comparable to those during human voice production (Herbst et al., 2012). The fundamental frequencies of the generated sounds are determined by the dimensions of the elephant vocal folds, ranging from approximately 8 to 10 cm in length in adult animals (Kühhaas, 2011).

Here, an in-depth analysis of the experimentally observed oscillatory phenomena in elephant voice production is presented. The vibratory characteristics are discussed with respect to the specific larynx anatomy of the elephant, as well as being related to anatomical and physiological aspects of human voice production.

Larynx specimen and computed tomography scan

The larynx specimen came from a 25 year old female African elephant, L. africana (Blumenbach 1797) (body mass 2500 kg), which died of natural causes in the Tierpark Berlin in October 2010. The larynx was excised several hours post-mortem on the same day, and immediately stored at −20°C [see supplementary material in a previous publication (Herbst et al., 2012)]. The frozen specimen was shipped to the Laboratory of Bioacoustics, Department of Cognitive Biology, University of Vienna. Here, the specimen was slowly thawed over a period of 3 days, and then immediately prepared for and used in data acquisition.

Computed tomography (CT) examination was performed using a Somatom Emotion multislice scanner (Siemens AG, Munich, Germany). The specimen was placed in a ventral recumbency, and was scanned at 130 kV, 200 mA, rotation time 0.6 s, resulting in 1 mm thick slices. Images were reconstructed using OsiriX 3.7.1 64 bit software (copyright, Antoine Rosset). The anatomical measurements shown in supplementary material Fig. S1 were taken with the FIJI Image processing package (Schindelin et al., 2012).

For cross-species comparison with a well-researched model of mammalian sound production, in vivo data from a CT scan of a 42 year old human male are included in this study. These data were acquired with a CT scanner (model LightSpeed VCT, GE Medical Systems, Cleveland, OH, USA) at 420 kVp (peak kilovoltage), 200 mA, 5 s, using a helical mode and transversal slices with a thickness of 0.625 mm.

Excised larynx setup

The vocal folds were adducted by exerting a steady manual pressure on the lateral surfaces of the corniculate processes of the arytenoid cartilages, thus moving the arytenoid cartilages both medially and anteriorly. The excised elephant larynx was phonated by blowing warmed humidified air through the trachea and past the manually adducted vocal folds. Vibration of the laryngeal tissue was documented using a Casio EX-F high-speed camera positioned 52 cm above the vocal fold level [see supplementary material in a previous publication (Herbst et al., 2012) for details].

For the data reported here, two conditions of larynx preparation were evaluated: in preparation stage a, the cranial parts of the larynx (epiglottis and laryngeal vestibulum) were left intact, in order to observe the role of these structures in sound generation; in preparation stage b, these structures were removed by transverse cuts, and the remaining parts of the vestibular folds were pulled away from the glottis with sutures in order to provide an unobstructed view of the oscillating vocal folds.

Electroglottographic, acoustic and air flow data

Electroglottography (EGG) is a method used to monitor relative vocal fold contact area during phonation (Fabre, 1957). A low-intensity, high-frequency current is passed between two electrodes placed on each side of the thyroid cartilage at glottal level. The time-varying change of vocal fold contact during the flow-induced oscillation of laryngeal tissue introduces alterations into the electrical impedance across the larynx, resulting in a variation of the current between the two electrodes (Fourcin and Abberton, 1971; Baken, 1992; Baken and Orlikoff, 2000). This approach provides a method to assess the oscillation of laryngeal tissue during voice production. In this study, the EGG signal was captured with a Glottal Enterprises EG 2-1000 two-channel electroglottograph (lower cut-off frequency at 2 Hz; Syracuse, NY, USA).

Acoustic data were captured with a DPA 4061 omni-directional microphone (DPA Microphones, Alleroed, Denmark) positioned 7 cm from the vocal folds. Both the acoustic and the EGG signal were recorded with a RME Fireface 800 external audio interface (RME, Haimhausen, Germany) at a sampling frequency of 44,100 Hz. Before analysis, the signals were downsampled to 8000 Hz with the software package Cool Edit Pro 2.0 (2095.0; Syntrillium, Phoenix, AZ, USA). In order to compensate for the time delay caused by the larynx-to-microphone distance, the acoustic signal was shifted forward in time by 0.21 ms relative to the video.
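
The size of this propagation delay follows from the microphone distance alone. The sketch below is illustrative only: the speed of sound is an assumed value (it is not stated in the text), and the function names are hypothetical.

```python
# Microphone distance as given in the text; the speed of sound is an assumption.
MIC_DISTANCE_M = 0.07
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def propagation_delay_ms(distance_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Acoustic travel time over distance_m, in milliseconds."""
    return 1000.0 * distance_m / speed_m_s


def delay_in_samples(delay_ms, sampling_rate_hz):
    """Express a delay in (possibly fractional) samples at a given rate."""
    return delay_ms * sampling_rate_hz / 1000.0


delay_ms = propagation_delay_ms(MIC_DISTANCE_M)
samples_at_44k = delay_in_samples(delay_ms, 44100)
print(f"delay = {delay_ms:.3f} ms = {samples_at_44k:.1f} samples at 44.1 kHz")
```

With the assumed 343 m s−1 the delay is about 0.204 ms; the 0.21 ms quoted above would correspond to a speed of sound of roughly 333 m s−1.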

The sound pressure level (SPL) was measured with a Voltcraft SL-400 sound level meter (Voltcraft, Hirschau, Germany), positioned 30 cm from the vibrating vocal folds. Trans-glottal air flow data were acquired with a Sensirion SDP1000-L differential pressure transducer and a F300L flow head (F-J Electronics, Vedbaek, Denmark). Both SPL and air flow data were collected with a Labjack U6 data acquisition interface (Lakewood, CO, USA) at a sampling rate of 1000 Hz. The time-series data were low-pass filtered with a 201 point moving averager.
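
At a sampling rate of 1000 Hz, a 201 point window spans about 0.2 s. A minimal sketch of such a moving averager is given below. Centring the window and shortening it at the signal edges are assumptions, as the text does not specify either.

```python
def moving_average(samples, window=201):
    """Centred moving average; near the edges only available samples are used."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(samples)
    # Prefix sums allow each window mean to be computed in constant time.
    prefix = [0.0]
    for x in samples:
        prefix.append(prefix[-1] + x)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed


# A constant signal is left unchanged by the filter.
flat = moving_average([1.0] * 500)
# A short ramp with a 3 point window shows the edge handling.
ramp = moving_average([0.0, 3.0, 6.0], window=3)
```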

High-speed video data analysis

In digital kymography (Wittenberg et al., 2000), the principles of videokymography (Svec and Schutte, 1996) are applied to high-speed video sequences. In order to create a digital kymogram (DKG), a line perpendicular to the vocal fold axis is selected within a high-speed video sequence, and the corresponding video data pixels are successively extracted for each video frame in the analyzed sequence. The extracted lines are concatenated to form the final graph. The DKGs created for this manuscript were generated with a Python script written by C.T.H. (Herbst, 2012), which was run as a plug-in within the FIJI image analysis software package (Schindelin et al., 2012). Digital kymography allows, amongst others, observation of the following features of laryngeal tissue vibration with high temporal resolution: (a) vertical phase differences between the lower (inferior) and the upper (superior) margins of the vocal folds (Baer, 1981; Titze et al., 1993b) – see supplementary material Fig. S2 for a schematic illustration; (b) mucosal waves (Hirano et al., 1981; Berke and Gerratt, 1993), i.e. air flow-driven travelling waves within the surface of the vocal fold tissue, moving along the trans-glottal air flow from the inferior to the superior vocal fold edge and then laterally across the upper vocal fold surface once every oscillatory cycle; and (c) vibratory asymmetries and different types of irregularities and cycle aberrations (Svec et al., 2007).
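
The line-extraction principle can be sketched in a few lines of Python. This is not the FIJI plug-in referred to above. It assumes that frames are plain nested lists of pixel values and that the scan line coincides with one image row; the function name is hypothetical.

```python
def digital_kymogram(frames, row):
    """Return a kymogram in which column i is the scan line taken from frame i."""
    # One scan line (a copy of pixel row `row`) per video frame.
    lines = [list(frame[row]) for frame in frames]
    # Transpose so that time runs horizontally and position vertically.
    return [list(column) for column in zip(*lines)]


# Tiny synthetic "video": 4 frames of 2 x 3 pixels each.
frames = [[[10 * f + 3 * r + c for c in range(3)] for r in range(2)]
          for f in range(4)]
kymogram = digital_kymogram(frames, row=1)
for kymogram_row in kymogram:
    print(kymogram_row)
```

Each row of the result follows one pixel of the scan line through time, which is what makes phase differences and cycle irregularities visible at the full frame rate.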

To enable quantitative analysis of the vibrating pattern along the entire length of the vocal folds, a clinically evaluated image processing procedure was applied, which is described in detail elsewhere (Lohscheller et al., 2007). Within each frame of the high-speed data, the algorithm extracts the medial edges of both vocal folds. This is used to create glottovibrograms (GVGs), i.e. a visualization technique that transfers information on the time-varying glottal width (as colour information) along the anterior–posterior (A–P) dimension into a single image (Lohscheller et al., 2008). Such images can also be used to objectively describe the 2D vibration type of glottal closure.
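
Once the medial edges are available, the mapping from edge positions to a GVG can be illustrated as follows. The edge-detection step of the cited procedure is not reproduced here. The input layout (one lateral coordinate per frame and A–P position) and all names are hypothetical.

```python
def glottovibrogram(left_edge, right_edge):
    """Return gvg[k][t]: glottal width at A-P position k in frame t.

    left_edge[t][k] and right_edge[t][k] are hypothetical inputs: the lateral
    pixel coordinate of each vocal fold's medial edge, per frame and position.
    """
    n_frames = len(left_edge)
    n_positions = len(left_edge[0]) if n_frames else 0
    gvg = [[0.0] * n_frames for _ in range(n_positions)]
    for t in range(n_frames):
        for k in range(n_positions):
            # Clamp at zero: touching or overlapping edges mean closure.
            gvg[k][t] = max(0.0, right_edge[t][k] - left_edge[t][k])
    return gvg


def closed_fraction(gvg_row):
    """Fraction of frames in which the glottis is closed at one A-P position."""
    return sum(1 for width in gvg_row if width == 0.0) / len(gvg_row)


# Synthetic example: the glottis opens at position 0 first, then at position 1.
left = [[5, 5, 5], [4, 5, 5], [3, 4, 5]]
right = [[5, 5, 5], [6, 5, 5], [7, 6, 5]]
gvg = glottovibrogram(left, right)
```

In a rendered GVG the width values would be mapped to colour, so an opening that starts at one end and spreads along the A–P axis appears as a slanted band.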

Anatomical terms of location

The phenomena we will describe are quite complex 3D events unfolding over time, and are difficult to verbalize. To make our descriptions as clear as possible, we will use plain English terms, oriented to the larynx itself, wherever possible (e.g. stating ‘front’ or ‘anterior’, rather than ‘rostro-ventral’ – see Fig. 1). Given that the structures of interest are oriented obliquely to the ‘anatomically correct’ rostral/caudal/dorsal/ventral axes, these terms would be very cumbersome. We hope this makes our descriptions clear and intelligible. Nonetheless, we suggest watching the videos included as supplementary material first (see supplementary material Movies 1–5).

Fig. 1.

Anatomical location terms used in this manuscript. Grey font: standard terms of location used in anatomy, illustrated schematically for an African elephant. Black font: terms of location used in human voice science, mapped onto the ‘intrinsic’ coordinate frame of the vocal folds, which are obliquely oriented in relation to the trachea (see also Fig. 3E,F). The terms ‘anterior’ and ‘posterior’ are adopted in this paper instead of ‘rostro-ventral’ and ‘caudo-dorsal’, respectively. Consequently, the terms ‘inferior’ and ‘superior’ are used to describe the axis normal to the anterior–posterior (A–P) dimension. For further simplification, the terms ‘front’, ‘back’, ‘lower’ and ‘upper’ are used interchangeably when describing phenomena relating to the vocal folds.

Anatomical data

The general anatomical configuration of the elephant larynx, exhibiting its close connection to the hyoid apparatus and its transition into the trachea, is shown in Fig. 2 and Fig. 3C.

The vocal folds of the elephant are long and voluminous (see Table 1). They attach far anteriorly, close to the broad base of the epiglottis. Their attachments to the vocal processes of the arytenoid cartilages are located far posteriorly at the level of the cricoid arch. The vocal folds are arranged obliquely, at an angle of about 45 deg relative to the longitudinal axis of the trachea (see Figs 2, 3). Given the position of the larynx in relation to the trachea, it must be assumed that only the posterior three-fifths of the vocal folds are directly exposed to the passing air stream from the lungs and trachea. Immediately above the inferior thyroid notch (where the anterior ends of the vocal folds attach), we observed a small area of distinctly higher ossification in comparison to the remaining parts of the thyroid cartilage (see Fig. 2).

In Fig. 3, the elephant larynx anatomy is juxtaposed with that of a human adult male to illustrate substantial configuration differences. The most obvious difference is found in the structural proportions. When normalizing the tracheal diameter just inferior to the cricoid cartilage in both the elephant and the human, fundamental dissimilarities in the laryngeal configuration become apparent (Fig. 3C–F): (1) relative to the human vocal fold, the elephant vocal fold is ca. 88% longer and ca. 180% thicker, and its cross-sectional area is ca. 406% larger – see Fig. 3E,F; (2) the vocal fold of the human is oriented nearly perpendicularly to the longitudinal axis of the trachea (and hence the tracheal air stream). In contrast, the elephant vocal fold is tilted by an angle of 45 deg; (3) whereas the human vocal fold borders upon the tracheal space along almost its entire length, the elephant vocal fold is positioned more anteriorly, such that the anterior two-fifths of the vocal fold are not directly adjacent to the tracheal space. As a consequence, simplified air stream–tissue interactions, which are assumed in most physical models of human or canine vocal fold vibration (Ishizaka and Flanagan, 1972; Kob, 2004), might have to be revisited when modelling and simulating sound generation in the elephant larynx.

Fig. 2.

Anatomy of the excised elephant larynx and parts of the hyoid apparatus. (A) Anterior view and (B) left lateral view of cartilage (light grey) and bone (dark grey) from 3D volume rendering of computed tomography (CT) data.

Periodic vocal fold vibration

A typical example of a periodic infrasonic vocalization created by the excised elephant larynx in preparation stage a is illustrated in Fig. 4 (see also supplementary material Movie 1). Phonation was induced in the larynx with a tracheal pressure of ca. 6 kPa (about half the maximum expiratory pressure found in human females; see Baken and Orlikoff, 2000), resulting in a regular, symmetrical oscillation of laryngeal tissue at ca. 15 Hz. The vestibular folds vibrated regularly (without collision) at the same frequency as the vocal folds, with a phase delay of ca. 180 deg as compared with the superior margins of the vocal folds.

The glottis (i.e. the visible air space between the vocal folds) was closed during ca. 82% of each vibratory cycle, i.e. the closed quotient was ca. 82%. This is considerably higher than, for example, the values measured for human speech (Baken and Orlikoff, 2000; Lohscheller et al., 2012). Such a surprisingly large closed quotient was facilitated by a phase difference between the inferior and the superior vocal fold edge: the initiation of vocal fold contact at the superior vocal fold edge was delayed by ca. 27 ms as compared with that of the inferior vocal fold edge. Thus, the two vocal fold edges vibrated with a phase delay of ca. 150 deg (inferior edge leading, see Fig. 4C). Assuming a vocal fold thickness of 32 mm, the propagation speed of glottal closure (and thus the mucosal wave speed) along the inferior–superior dimension can be estimated as ca. 1.19 m s−1, which is comparable to data from excised canine larynges (Baer, 1981; Titze et al., 1993b).
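
The phase delay and the propagation speed quoted above follow directly from the reported contact delay, fundamental frequency and assumed vocal fold thickness. The short sketch below repeats that arithmetic; the variable names are illustrative.

```python
# Values taken from the text.
F0_HZ = 15.0               # fundamental frequency (ca.)
CONTACT_DELAY_S = 0.027    # superior edge contacts ca. 27 ms after inferior edge
THICKNESS_M = 0.032        # assumed vocal fold thickness (32 mm)

period_s = 1.0 / F0_HZ
# Phase delay: the contact delay expressed as a fraction of one cycle.
phase_delay_deg = 360.0 * CONTACT_DELAY_S / period_s
# Propagation speed of glottal closure from the inferior to the superior edge.
wave_speed_m_s = THICKNESS_M / CONTACT_DELAY_S

print(f"period = {1000 * period_s:.1f} ms")
print(f"phase delay = {phase_delay_deg:.1f} deg")
print(f"wave speed = {wave_speed_m_s:.2f} m/s")
```

The computed phase delay of about 146 deg is consistent with the rounded value of ca. 150 deg, given that the frequency and delay are themselves approximate.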

Analysis of the time-synchronous microphone signal and high-speed video data suggests that acoustic energy was created at two instants within each vibratory cycle. The main excitation event occurred at the separation of the superior vocal fold edges, i.e. at the presumed onset of glottal air flow. A secondary, less pronounced excitation event was found at the initiation of vocal fold contact at the inferior vocal fold edge, i.e. at the presumed termination of glottal air flow. These two events are marked by the two vertical arrows in Fig. 4B,C.

Fig. 3.

Comparison of elephant and human laryngeal geometry. (A) Virtual transverse section through the elephant larynx, based on 3D multi-planar reformatting (MPR). The oblique red dashed line marks the extraction of the sagittal 3D MPR section shown in C. (B) Virtual transverse section, based on 3D MPR, of a human larynx. The oblique red dashed line marks the extraction of the sagittal 3D MPR section shown in D. (C,D) Virtual sagittal 3D MPR sections through elephant larynx and human larynx, respectively, scaled to have the same relative tracheal diameter, as measured just below the cricoid cartilage. The scale on the side of each image has a total length of 10 cm. (1) Thyroid cartilage, (2) epiglottis, (3) vestibular fold, (4) arytenoid cartilage, (5) vocal fold, (6) cricoid cartilage, (7) trachea. (E,F) Schematic illustration of laryngeal structures, vocal folds and airways, created from data shown in C and D, respectively. The angle of the vocal fold in relation to the trachea is indicated (γ).

Synchronized vibration of vocal folds and vestibular folds

The effect of vestibular fold oscillation is illustrated in Fig. 5 (see also supplementary material Movie 2). Over the course of 3 s, the manual adduction (i.e. approximation) of the vestibular folds was gradually increased. As a result, the already vibrating vestibular folds started to collide around t=1.5 s, leading to a SPL increase of more than 12 dB over the entire sequence. As in the previous example, the vestibular folds vibrated with a phase delay of ca. 180 deg (in relation to the vibration of the vocal folds). As a consequence of the periodic collision of the vestibular folds during the second half of the sequence, a synchronized ‘airlock’ oscillation of the vocal folds and the vestibular folds emerged. In this kind of vibratory pattern, the vocal folds and the vestibular folds formed a single coupled system with alternating contact of either of the involved tissue structures, such that the glottis was never visible. This system was presumably very efficient in the conversion of aerodynamic energy into mechanical tissue vibrations, and hence the generation of acoustic energy.
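
To put the 12 dB figure into perspective, a level difference in decibels can be converted into linear ratios using the standard definitions of sound pressure level. The sketch below is a unit conversion only and makes no statement about the measurement itself.

```python
SPL_INCREASE_DB = 12.0  # level increase reported in the text

# SPL is 20*log10 of a pressure ratio, i.e. 10*log10 of a power ratio.
pressure_ratio = 10.0 ** (SPL_INCREASE_DB / 20.0)
power_ratio = 10.0 ** (SPL_INCREASE_DB / 10.0)

print(f"sound pressure ratio: {pressure_ratio:.2f}")
print(f"acoustic power ratio: {power_ratio:.2f}")
```

A 12 dB increase thus corresponds to roughly a fourfold increase in sound pressure amplitude, or about a 16-fold increase in acoustic power.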

Complex ‘double zipper’ vocal fold oscillation

A case of a complex nearly periodic oscillatory pattern of the vestibular and vocal folds, obtained with a tracheal pressure of 6 kPa, is documented in Fig. 6 (see also supplementary material Movie 3). The most striking feature of this oscillatory pattern is the simultaneous ‘double zipper’ oscillation of the superior and inferior edges of the vocal folds. A ‘zipper’ oscillation (Childers et al., 1986; Hess and Ludwigs, 2000) is characterized by A–P phase differences, strongly resembling the x–21 vibratory mode described previously (Berry et al., 1994; Titze, 2000) (see supplementary material Fig. S3).

Table 1.

Anatomical measurements of the excised elephant larynx

Three visible phases of this pattern (during one cycle) can be distinguished: (1) the separation of the upper (superior) vocal fold margins, starting in the front (image 4 in Fig. 6D,E) and propagating to the back; (2) the approximation/contacting of the lower edge of the vocal folds, starting in the back (image 9 in Fig. 6D,E) and propagating to the front; and (3) the approximation/contacting of the upper vocal fold edges, starting in the front (image 10–11 in Fig. 6D,E) and propagating to the back. The separation of the lower edges, presumably occurring during the first 15 ms of the cycle, is obscured by the contacting upper vocal fold edges.

Analysis of the instant of complete approximation/closure of both superior and inferior vocal fold edges revealed a phase difference of ca. 83 deg. The two vocal fold edges were 180 deg out of phase along the A–P axis. The vestibular folds participated in the nearly periodic vibration at a phase difference of ca. 180 deg (as compared with the superior vocal fold edges), just as in the examples shown in Figs 4 and 5.

The mid-sagittal DKG shown in Fig. 6C reveals that both the posterior commissure of the vocal folds and the corniculate processes of the arytenoid cartilages vibrated synchronously with the vestibular and vocal folds, providing evidence for an A–P longitudinal vibratory mode. The posterior commissure of the vocal folds reached its maximum posterior position shortly after the superior vocal fold edges were maximally separated (see lower sinusoidal dashed line in Fig. 6C; supplementary material Fig. S4). The corniculate processes of the arytenoids vibrated with a lesser amplitude and a phase difference of ~100–110 deg as compared with the posterior commissure of the vocal folds (see Fig. 6C; supplementary material Fig. S4).

Analysis of synchronous tissue vibrations (Fig. 6C) and the generated sound (Fig. 6A,B) revealed that the mechanical aspect of the oscillation of laryngeal tissue was more regular than the acoustic output. Minor perturbations in the laryngeal tissue mechanics might have had large impacts on the resulting time-varying air flow, suggesting the presence of complex aerodynamic phenomena.

Irregular vocal fold vibration

A case of irregular vocal fold vibration is illustrated in Fig. 7. Spectral analysis of the acoustic signal revealed only two clear harmonics (Fig. 7A), while most of the spectrum was characterized by non-harmonic energy. Overall, the oscillation of laryngeal tissue assumed a nearly periodic pattern, which was particularly true for the vestibular folds (Fig. 7C,E). The irregularities in the acoustic signal were presumably introduced by complex irregular vibratory modes along the A–P axis (see supplementary material Movie 4).

Fig. 4.

Regular flow-induced periodic vocal fold vibration (fundamental frequency ca. 15 Hz) in the excised elephant larynx, preparation stage a (see ‘Excised larynx setup’, Materials and methods). (A) Sound pressure level (SPL) measured at a distance of 30 cm from the vocal folds (blue) and averaged air flow (orange). (B) Acoustic signal. The red arrows at t≈0.72 s indicate a ‘double acoustic excitation’ within one oscillatory cycle, created by both the de-contacting and contacting event of the vocal folds. a.u., arbitrary units. (C) Digital kymogram (DKG). The DKG scan line was placed on a froth particle, allowing the superior edge of the left vocal fold to be traced. (D) Top view of vocal folds and vestibular folds of the excised larynx. The scan line for creating the DKG shown in C is depicted in yellow.

Fig. 5.

Gradual engagement of vestibular folds (larynx preparation stage a). (A) SPL measured at a distance of 30 cm from the vocal folds (blue) and averaged air flow (orange). The SPL increased by 12 dB over the entire sample. (B) Acoustic signal. (C) DKG. Note the gradual increase of vestibular fold oscillation, starting to collide at t≈1.5 s. (D) enlarged DKGs, extracted at t=0.5–0.8 s (left) and t=2.5–2.8 s (right). The collision of the vestibular folds is evident in the DKG shown in the right panel.

Alternating A–P transverse travelling waves

As an unusual case of vocal fold oscillation, intermittent episodes of alternating A–P travelling waves were obtained during larynx preparation stage b (see supplementary material Movie 5). These travelling waves occurred at a high degree of manually induced vocal fold elongation, resulting in a fundamental frequency of ca. 27 Hz. One of these episodes is illustrated in Fig. 8. It is characterized by the alternating occurrence of a posterior–anterior (P–A) and an A–P ‘zipper-like’ glottal opening/closure pattern.

This manuscript presents unique data from laryngeal oscillations in an excised elephant larynx, the largest animal sound generator ever studied experimentally. As shown in a previous publication (Herbst et al., 2012), the basic sound production mechanism of elephant vocalizations at the observed fundamental frequencies is similar to that seen in humans and many other mammals, and consists of flow-induced self-sustaining oscillations of laryngeal tissues. Here, an in-depth analysis of the experimentally observed oscillatory patterns in elephant voice production is presented, including some complex phenomena so far not documented in the voice science literature. The data include various combinations of: phase differences of the superior and inferior vocal fold edge; simultaneous phase-shifted A–P ‘double zipper’ oscillation of the superior and inferior vocal fold edges; oscillation of the vestibular folds, simultaneous with vocal fold oscillation, resulting in increased acoustic energy; A–P longitudinal vibratory modes, involving the (apexes of the) arytenoid cartilages; and alternating A–P and P–A travelling waves.

Anatomical considerations

Although without further specimens we cannot rule out the possibility that the observed complex biomechanical phenomena might be unique for the investigated sample, we hypothesize that they are typical features resulting from the anatomy of the elephant larynx. The most obvious difference from previous studies in other species is larynx size and position: the elephant vocal fold is five times longer and three times thicker than that of the largest species (Panthera tigris) so far examined in an excised larynx setup (Titze et al., 2010). However, the elephant larynx is not just a linearly scaled version of the human larynx. When considering the normalized dimensions (scaled to similar tracheal diameter), the elephant vocal fold is still several times larger than that of a human (see Fig. 3C–F).

The vocal fold is oriented to the tracheal air stream at an oblique angle, suggesting that the anterior two-fifths of the vocal folds are not in the direct line of the tracheal air stream. This may lead to complex interactions between the trans-glottal flow and the tissue mechanics, which are not found in humans, in whom the entire vocal fold is positioned almost perpendicular to and entirely within the tracheal air flow.

Fig. 6.

Complex vibratory pattern of elephant vocal folds (larynx preparation stage a). (A) Spectrum of acoustic signal. (B) Acoustic signal. (C) Left panel: DKG, extracted along a mid-sagittal axis of the elephant vocal folds. The vibration of both the posterior commissure of the vocal folds (i.e. the most posteriorly visible portion of the vocal fold) and the apexes of the arytenoids (at the point where the corniculate processes of the arytenoids touch) is illustrated by two dashed sinusoids (top and centre of left-half of the DKG, respectively). The vertical dashed line is drawn to visualize the phase offset between the vibration of the apexes of the arytenoids and the vibrating posterior commissure (see supplementary material Fig. S4 for further details). Right panel: top view of vocal folds, showing the scan line for extraction of the DKG displayed in the left panel. (D) Sequence of high-speed images (extracted every 6.67 ms). Note the A–P phase delay along the superior vocal fold edge, and the posterior–anterior (P–A) phase delay along the inferior vocal fold edge (seen in images 7–12 of the sequence). (E) Manual tracing of structures seen in D. Note the simultaneous approximation/contacting pattern of the superior and inferior edges of the vocal folds, moving in opposite A–P directions (images 9–12 of the sequence).

The forces created by the longitudinal vibratory modes, together with the phase-delayed oscillations of the tissue surrounding the vocal folds (see supplementary material Movies 1–5) might have led to the development of the ossified portion of the thyroid cartilage at the anterior commissure of the vocal folds (see Fig. 2). Such an ossification regularly occurs in mammalian larynges with increasing age. The ossification process is a self-regulating adaptation of the connective and supporting tissue to mechanical stress. Ossification preferentially occurs in areas of deformation forces transferred to the laryngeal framework by contracting muscles. Laryngeal anatomy and the pattern of forces acting on the larynx differ between species and, therefore, the pattern of ossification is species specific (von Glass and Pech, 1983), as well as individually variable.

Transverse travelling wave patterns

The unusual layout of the elephant larynx might also facilitate a special vocal fold vibratory pattern: alternating A–P and P–A travelling waves. Such an oscillation phenomenon has not previously been documented in the literature, and it might elude explanation with the models so far developed for describing flow-induced vocal fold vibration.

As a first approximation, the vocal fold can be considered as a string that is suspended between one rigid boundary (the osseous portion of the thyroid cartilage at the anterior commissure – see Fig. 2) and one virtually free boundary (the vocal process of the arytenoid cartilage). The propagation speed (v, m s−1) of a wave along an ideal string can be expressed as:
$$v = \sqrt{\frac{\Theta}{\mu}} = \sqrt{\frac{\Theta L}{m}} \qquad (1)$$
where Θ is the tension (measured in Newtons; note, tension is usually indicated by the symbol T; however, as T will be used for the period of vocal fold vibration in this paper, tension is indicated by the Greek symbol Θ), m is the mass, L is the length and μ=m/L is the mass per unit length of the string (Serway, 1990).

Based on CT data, the approximate dimensions of such a string model of the elephant vocal fold are determined by a length L of ca. 10 cm, a thickness t of 2.5 cm, and a width w of 2 cm. When assuming a tissue density ρ of 1.04 g cm−3, the mass of one vocal fold is estimated as 52 g. The oscillation shown in Fig. 8 had a period of 37 ms, during which the longitudinally travelling wave covered a distance of twice the vocal fold length, resulting in a wave speed of 5.4 m s−1. Inserting these values into Eqn 1 and solving for Θ predicts a tension of ca. 15.16 N, and consequently a tensile stress in the vocal fold of ca. 30.33 kPa (obtained by dividing the tension by the cross-sectional area of the vocal fold). Vocal fold tensile stress of this magnitude is well within the range of both theoretical predictions (Titze, 1994) and experimental data (Alipour and Vigmostad, 2012) for other mammals.
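The arithmetic in this estimate can be retraced with a short script. The following is a minimal sketch, assuming Eqn 1 has the ideal-string form v=√(Θ/μ), so that Θ=μv²; the function and variable names are illustrative and do not come from the study. Because the sketch keeps the unrounded wave speed (about 5.41 m s−1), it yields roughly 15.2 N and 30.4 kPa; the values quoted above follow when the speed is rounded to 5.4 m s−1.

```python
def string_model_estimate(length_m=0.10, thickness_m=0.025, width_m=0.02,
                          density_kg_m3=1040.0, period_s=0.037):
    """Retrace the ideal-string estimate in SI units.

    Default values are the figures quoted in the text (CT dimensions,
    tissue density of 1.04 g cm^-3, period of the oscillation in Fig. 8).
    """
    cross_section_m2 = thickness_m * width_m
    mass_kg = density_kg_m3 * length_m * cross_section_m2
    mu_kg_m = mass_kg / length_m                 # mass per unit length
    wave_speed_m_s = 2.0 * length_m / period_s   # wave covers 2L in one period
    tension_n = mu_kg_m * wave_speed_m_s ** 2    # Eqn 1 solved for the tension
    stress_pa = tension_n / cross_section_m2     # tension / cross-sectional area
    return {
        "mass_kg": mass_kg,
        "wave_speed_m_s": wave_speed_m_s,
        "tension_n": tension_n,
        "stress_pa": stress_pa,
    }


if __name__ == "__main__":
    for name, value in string_model_estimate().items():
        print(f"{name}: {value:.4g}")
```

Keeping all inputs as keyword arguments makes it easy to see how sensitive the tension estimate is to the assumed dimensions, which are only approximate.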

Fig. 7.

Irregular vibration of vocal folds and vestibular folds (larynx preparation stage a). (A) Spectrum of acoustic signal. (B) Acoustic signal. (C) DKG. The DKG scan line was placed on a froth particle, thereby allowing better visualization of the vibratory pattern of the left vestibular fold. (D) Top view of vocal folds and vestibular folds of the excised larynx. The scan line for creating the DKG shown in C is depicted in yellow. (E) Enlarged DKG section from C, extracted at t=0.5–0.9 s.


By analogy with, e.g. a guitar string model, the vocal fold is ‘plucked’ (i.e. deflected laterally) in its posterior three-fifths by an aerodynamically induced glottal opening event. (By hypothesis, the glottal opening cannot originate in the anterior two-fifths of the vocal folds, as this region has no direct contact with the air stream coming from the trachea – see the schematic illustration in Fig. 3E.) In the simplified model proposed here, the lateral deflection is initially propagated only anteriorly (driven by the trans-glottal air flow that interacts with the vocal fold at an oblique angle of ca. 45 deg). (In a more complex model, the lateral deflection might initially also propagate posteriorly, but the reflection from the posterior end is considerably damped at the free boundary so that the standing wave cannot fully emerge.) The travelling glottal deflection pulse is then reflected by the rigid anterior boundary and propagates posteriorly, where it is damped by the softer boundary at the vocal processes of the arytenoids.

Three observed vocal fold vibration scenarios can be explained by this model, depending on (a) the temporal delay between the initiation of two successive lateral deflections, determined by the period T of vocal fold vibration (which is the reciprocal of the fundamental frequency), and (b) the time τ (s) that is required for a travelling wave to complete an entire round trip along the A–P dimension of the vocal folds, defined by:
$$\tau = \frac{2L}{v} \qquad (2)$$
where L is the vocal fold length and v is the speed of the travelling mucosal wave (Eqn 1).

The following three scenarios are described with the assumption that lateral deflections are initiated at the same position along the A–P axis: (1) if the deflection pulse periodically completes exactly one round-trip along the vocal fold before the next deflection pulse occurs (τ=T), an alternating P–A and A–P travelling wave pattern will emerge; (2) if the consecutive deflection pulses are delayed such that a spatio-temporal synchronization between consecutive deflections is achieved where τ=2T, an oscillatory pattern characterized by a strong P–A phase delay is likely to emerge; (3) in any other case, if no spatio-temporal synchronization between travelling waves and consecutive deflection pulses is achieved, an irregular vocal fold vibratory pattern is likely to occur.
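The three scenarios can be sketched as a simple classification of the ratio τ/T. The code below is an illustrative sketch only: it reads Eqn 2 as τ=2L/v (one round trip over twice the vocal fold length), and the relative tolerance `rel_tol` is a hypothetical parameter introduced here, since the text states the conditions as exact equalities and gives no tolerance.

```python
def round_trip_time(length_m, wave_speed_m_s):
    """Eqn 2 (as read here): time for a wave to travel the fold length twice."""
    return 2.0 * length_m / wave_speed_m_s


def classify_regime(tau_s, period_s, rel_tol=0.05):
    """Map the ratio tau/T onto the three scenarios described in the text.

    rel_tol is an assumed tolerance for treating tau/T as 'equal' to 1 or 2;
    it is not a value taken from the study.
    """
    ratio = tau_s / period_s
    if abs(ratio - 1.0) <= rel_tol:
        return "alternating P-A and A-P travelling wave"   # scenario 1
    if abs(ratio - 2.0) <= 2.0 * rel_tol:
        return "strong P-A phase delay"                    # scenario 2
    return "irregular vibration"                           # scenario 3


if __name__ == "__main__":
    # Values from the example in Fig. 8: L = 0.10 m, v = 5.4 m/s, T = 37 ms.
    tau = round_trip_time(0.10, 5.4)
    print(classify_regime(tau, 0.037))
```

For the Fig. 8 values the round-trip time is about 37 ms, so τ/T is close to 1 and the sketch returns the travelling wave scenario.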

The travelling wave pattern is facilitated by two biomechanical features: (1) a low fundamental frequency, which effectively increases the temporal delay between two successive lateral deflections (i.e. the period T) and thus allows the travelling wave to complete an entire round trip along the vocal fold before initiation of the succeeding lateral vocal fold deflection (caused by an air pulse); (2) an increased longitudinal tension in the vocal folds, leading to an increased speed of the travelling wave. Both these facilitating conditions were fulfilled in the example shown in Fig. 8.
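As a rough illustration of how these two conditions interact, one can impose the travelling wave condition τ=T=1/f0 on the string model, reading Eqn 1 as v=√(Θ/μ) and Eqn 2 as τ=2L/v. The symbol f0 (fundamental frequency) is the only one introduced here, and the result is a sketch under the ideal-string assumption rather than a statement about real tissue mechanics:

$$
\begin{aligned}
\tau = T \;&\Rightarrow\; \frac{2L}{\sqrt{\Theta/\mu}} = \frac{1}{f_0} \\
&\Rightarrow\; \Theta = 4\,\mu\,L^{2} f_0^{2} = 4\,m\,L\,f_0^{2}
\end{aligned}
$$

With m=0.052 kg, L=0.10 m and f0=27 Hz, this gives Θ≈15.2 N, in line with the tension of ca. 15 N estimated above. In this idealization the tension needed to satisfy τ=T grows with the square of the fundamental frequency, so a lower fundamental frequency allows the round-trip condition to be met at a lower tension.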

Fig. 8.

Travelling wave in elephant vocal fold oscillation (larynx preparation stage b) at ca. 27 Hz. (A) Individual pictures from high-speed video data, demonstrating a P–A (left group) and an A–P (right group) travelling wave. (B) Glottovibrogram of 2.5 cycles of a travelling wave in the elephant vocal folds. The colour-coded glottal width (measured in pixels) for combinations of time (x-axis) and A–P position along the vocal folds (y-axis) is displayed on the z-axis. The dotted arrows indicate the directions of the two travelling waves shown in A.


The model presented here is not intended to offer a complete explanation of flow-driven self-sustaining oscillation, but it might prove to be a useful addition to the already well-established descriptions of vocal fold vibration. In particular, it may assist in explaining vibratory phenomena in species with very long vocal folds, and it might also shed new light on the phenomenon of ‘zipper-like’ glottal opening and closure in humans and other mammals.

Creation of acoustic energy and the role of the vestibular folds

The data presented in this manuscript suggest that acoustic energy in elephant vocalization is mainly produced at the instant of glottal opening, i.e. during the commencement of trans-glottal air flow (recall Fig. 4). The instant of glottal closure (the cessation of trans-glottal air flow) contributes to a lesser extent to the production of acoustic energy. This is a surprising finding, as it is in contrast to what is known about sound generation in humans, where most of the acoustic energy is created during the instant of glottal closure (Miller and Schutte, 1984; Schutte and Miller, 1988). Further studies are required to make detailed measurements of the time-varying trans-glottal air flow (only averaged air flow rates have been measured in this study), and to investigate the complex aerodynamic phenomena that appear to be found in the elephant larynx during sound generation.

The vestibular folds might play a crucial role in ordinary elephant sound production. Even when not colliding, they tended to vibrate at the same fundamental frequency (but with a phase delay) as the vocal folds in the excised larynx, thus forming a coupled oscillator with the vocal folds. When colliding during periodic sound generation, the vestibular folds were found to facilitate an ‘airlock’ oscillation where the glottis was never visible (see Fig. 5). Such a 180 deg phase-shifted 1:1 entrainment of vestibular fold and vocal fold oscillation might enhance the transfer of aerodynamic energy into the vibrating tissue (Titze, 1988), increasing the output sound level (+12 dB in the case documented here) and thus the efficiency of the oscillator. These results are in line with previous research (Finnegan and Alipour, 2009), highlighting the importance of supraglottal tissue structures in sound generation, similar to what has been documented for humans (Fuks et al., 1998; Lindestad and Södersten, 1999; Sakakibara et al., 2004; Bailly et al., 2010).
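To put the +12 dB figure in perspective, a level difference can be converted into linear ratios. The sketch below assumes the standard definition of sound pressure level, 20·log10(p/p0), so that a level difference ΔL corresponds to a pressure ratio of 10^(ΔL/20) and a power (intensity) ratio of 10^(ΔL/10); the function names are illustrative.

```python
def db_to_pressure_ratio(delta_db):
    """Sound pressure ratio corresponding to a level difference in dB."""
    return 10.0 ** (delta_db / 20.0)


def db_to_power_ratio(delta_db):
    """Acoustic power (intensity) ratio corresponding to a level difference in dB."""
    return 10.0 ** (delta_db / 10.0)


if __name__ == "__main__":
    gain_db = 12.0  # level increase reported with adducted vestibular folds
    print(f"pressure ratio: {db_to_pressure_ratio(gain_db):.2f}")
    print(f"power ratio:    {db_to_power_ratio(gain_db):.1f}")
```

Under this definition, 12 dB corresponds to roughly a fourfold increase in sound pressure and roughly a sixteenfold increase in acoustic power.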

Physiological relevance

To date, no physiological data on in vivo elephant voice production are available. In particular, in contrast to human or canine phonation (Zemlin, 1988; Hunter et al., 2004; Herbst et al., 2011; Chhetri et al., 2012), the subglottal air pressure ranges and the exact laryngeal configuration for vocal fold adduction in elephants are not known. The position of the arytenoid cartilages in the excised larynx experiment had to be inferred from careful examination of the available CT data and the functional and mechanical possibilities offered by the excised elephant larynx. Whether the adductory manoeuvres performed in this study exactly resemble those of in vivo vocalization would need to be established in future studies, which will be very challenging to perform as direct endoscopic evaluation of vocal fold vibration in a live animal is virtually impossible with the current technological means. However, several arguments speak in favour of a faithful duplication of natural vocalization conditions: (1) the laryngeal configuration was created in the only way possible to easily induce phonation in our excised larynx; (2) similar adductory gestures to the ones we used in the excised larynx experiments have been documented in humans; and (3) the acoustic output of the excised larynx was closely comparable to sounds captured from in vivo vocalizations, with fundamental frequencies that were well within the range of those reported for the ‘rumble’ call type (Poole et al., 1988; Langbauer, 2000; Herbst et al., 2012; Stoeger et al., 2012).

The mammalian larynx is a non-linear system capable of exhibiting a wide range of vibratory behaviour, such as periodic vibration, subharmonics and deterministic chaos (Titze et al., 1993a; Herzel et al., 1995; Behrman and Baken, 1997; Fitch et al., 2002; Neubauer et al., 2004; Jiang et al., 2006). In such a system, small changes of boundary conditions can lead to fundamentally different oscillation patterns (Berry et al., 1996; Svec et al., 1999; Tokuda et al., 2008; Herbst et al., 2013). Consequently, the different vibratory regimes documented in this study may have arisen from subtle differences in the adduction of the arytenoids across various flow-induced vocalizations with comparable air pressure conditions.

Conclusions

The elephant larynx is the largest oscillator for mammalian voice production that has received experimental study to date. It is able to produce a wide variety of complex vibratory phenomena, such as: simultaneous ‘double-zippering’ of the superior and inferior vocal fold edges; ‘airlock’ oscillations involving the vestibular folds (increasing the efficiency of the coupled oscillator as a sound generation device); and transverse travelling waves along the A–P axis. With respect to the transverse travelling waves, we propose a new model augmented by flow-driven travelling waves. Such a model is capable of explaining travelling waves, ‘conventional’ standing wave vocal fold vibration and irregular vocal fold vibration.

Our sincere thanks go to Dr Bernhard Blaszkiewitz (Direktor, Tierpark Berlin Friedrichsfelde) for supplying us with the elephant larynx. We thank R. Hofer for contributing to the setup of the excised larynx experiment and P. Pesak for assisting in the computed tomography scan of the larynx specimen.

FUNDING

This research was supported by European Research Council (ERC) Advanced Grant ‘SOMACCA’ (C.T.H. and W.T.F.); a start-up grant from the University of Vienna (W.T.F.); the European Social Fund Project OP VK CZ.1.07/2.3.00/20.0057 (J.G.S.); Deutsche Forschungsgemeinschaft (DFG) grant no. LO1413/2-2 (J.L.); and Austrian Science Fund (FWF) grant P2309921 (A.S.S.).

     
  • A–P: anterior–posterior
  • CT: computed tomography
  • DKG: digital kymogram
  • EGG: electroglottography
  • GVG: glottovibrogram
  • MPR: multi-planar reformatting
  • P–A: posterior–anterior
  • SPL: sound pressure level

Alipour, F., Vigmostad, S. (2012). Measurement of vocal folds elastic properties for continuum modeling. J. Voice 26, 816.e21-816.e29.
Baer, T. (1981). Observation of vocal fold vibration: measurements of excised larynges. In Vocal Fold Physiology (ed. Stevens, K., Hirano, M.), pp. 119-133. Tokyo: University of Tokyo Press.
Bailly, L., Henrich, N., Pelorson, X. (2010). Vocal fold and ventricular fold vibration in period-doubling phonation: physiological description and aerodynamic modeling. J. Acoust. Soc. Am. 127, 3212-3222.
Baken, R. J. (1992). Electroglottography. J. Voice 6, 98-110.
Baken, R. J., Orlikoff, R. F. (2000). Clinical Measurement of Speech and Voice, 2nd edn. San Diego, CA: Singular Thompson Learning.
Behrman, A., Baken, R. J. (1997). Correlation dimension of electroglottographic data from healthy and pathologic subjects. J. Acoust. Soc. Am. 102, 2371-2379.
Berke, G. S., Gerratt, B. R. (1993). Laryngeal biomechanics: an overview of mucosal wave mechanics. J. Voice 7, 123-128.
Berry, D. A., Herzel, H., Titze, I. R., Krischer, K. (1994). Interpretation of biomechanical simulations of normal and chaotic vocal fold oscillations with empirical eigenfunctions. J. Acoust. Soc. Am. 95, 3595-3604.
Berry, D. A., Herzel, H., Titze, I. R., Story, B. H. (1996). Bifurcations in excised larynx experiments. J. Voice 10, 129-138.
Bradbury, J. W., Vehrencamp, S. L. (1998). Principles of Animal Communication. Sunderland, MA: Sinauer Associates.
Chhetri, D. K., Neubauer, J., Berry, D. A. (2012). Neuromuscular control of fundamental frequency and glottal posture at phonation onset. J. Acoust. Soc. Am. 131, 1401-1412.
Childers, D. G., Hicks, D. M., Moore, G. P., Alsaka, Y. A. (1986). A model for vocal fold vibratory motion, contact area, and the electroglottogram. J. Acoust. Soc. Am. 80, 1309-1320.
Döllinger, M., Kobler, J. B., Berry, D., Mehta, D., Luegmair, G., Bohr, C. (2011). Experiments on analysing voice production: excised (human, animal) and in vivo (animal) approaches. Current Bioinformatics 6, 286-304.
Fabre, P. (1957). Un procédé électrique percutane d'inscription de l'accolement glottique au cours de la phonation: glottographie de haute fréquence; premiers résultats. [Percutaneous electric process registering glottic union during phonation: glottography at high frequency; first results]. Bull. Acad. Natl. Med. 141, 66-69.
Finnegan, E. M., Alipour, F. (2009). Phonatory effects of supraglottic structures in excised canine larynges. J. Voice 23, 51-61.
Fitch, W. T., Neubauer, J., Herzel, H. (2002). Calls out of chaos: the adaptive significance of nonlinear phenomena in mammalian vocal production. Anim. Behav. 63, 407-418.
Fourcin, A. J., Abberton, E. (1971). First applications of a new laryngograph. Med. Biol. Illus. 21, 172-182.
Fuks, L., Hammarberg, B., Sundberg, J. (1998). A self-sustained vocal-ventricular phonation mode: acoustical, aerodynamic and glottographic evidences. KTH TMH-QPSR (Stockholm) 3, 49-59.
Garstang, M. (2004). Long-distance, low-frequency elephant communication. J. Comp. Physiol. A 190, 791-805.
Hauser, M. D. (1996). The Evolution of Communication. Cambridge, MA: MIT Press.
Herbst, C. T. (2012). DKG plugin for FIJI, vol. 2013. Vienna, Austria. http://www.christian-herbst.org/index.php?page=fiji
Herbst, C. T., Qiu, Q., Schutte, H. K., Švec, J. G. (2011). Membranous and cartilaginous vocal fold adduction in singing. J. Acoust. Soc. Am. 129, 2253-2262.
Herbst, C. T., Stoeger, A. S., Frey, R., Lohscheller, J., Titze, I. R., Gumpenberger, M., Fitch, W. T. (2012). How low can you go? Physical production mechanism of elephant infrasonic vocalizations. Science 337, 595-599.
Herbst, C. T., Herzel, H., Svec, J. G., Wyman, M. T., Fitch, W. T. (2013). Visualization of system dynamics using phasegrams. J. R. Soc. Interface 10, 20130288.
Herzel, H. (1995). Non-linear dynamics of voiced speech. In Nonlinear Dynamics: New Theoretical and Applied Results (ed. Awrejcewicz, J.). Berlin: Akademie Verlag.
Herzel, H., Berry, D., Titze, I., Steinecke, I. (1995). Nonlinear dynamics of the voice: signal analysis and biomechanical modeling. Chaos 5, 30-34.
Hess, M. M., Ludwigs, M. (2000). Strobophotoglottographic transillumination as a method for the analysis of vocal fold vibration patterns. J. Voice 14, 255-271.
Hirano, M., Kakita, Y., Kawasaki, H., Gould, W. J., Lambiase, A. (1981). Data from high-speed motion picture studies. In Vocal Fold Physiology (ed. Stevens, K. N., Hirano, M.), pp. 85-93. Tokyo: University of Tokyo Press.
Hunter, E. J., Titze, I. R., Alipour, F. (2004). A three-dimensional model of vocal fold abduction/adduction. J. Acoust. Soc. Am. 115, 1747-1759.
Ishizaka, K., Flanagan, J. L. (1972). Synthesis of voiced sounds from a two-mass model of the vocal cords. Bell Syst. Tech. J. 51, 1233-1268.
Jiang, J. J., Zhang, Y., McGilligan, C. (2006). Chaos in voice, from modeling to measurement. J. Voice 20, 2-17.
Kob, M. (2004). Singing voice modeling as we know it today. Acta Acustica united with Acustica 90, 649-661.
Kühhaas, P. (2011). Morphologie des Larynx des Afrikanischen Elefanten (Loxodonta africana). PhD dissertation, University of Veterinary Medicine, Vienna, Austria.
Langbauer, W. R. (2000). Elephant communication. Zoo Biol. 19, 425-445.
Leong, K. M., Ortolani, A., Burks, K. D., Mellen, J. D., Savage, A. (2003). Quantifying acoustic and temporal characteristics of vocalizations for a group of captive African elephants Loxodonta africana. Bioacoustics 13, 213-231.
Lindestad, P., Södersten, M. (1999). Voice source characteristics in Mongolian ‘throat singing’ studied with high speed imaging technique, acoustic spectra and inverse filtering. Phoniatric and Logopedic Progress Report 11, 17-26.
Lohscheller, J., Toy, H., Rosanowski, F., Eysholdt, U., Döllinger, M. (2007). Clinically evaluated procedure for the reconstruction of vocal fold vibrations from endoscopic digital high-speed videos. Med. Image Anal. 11, 400-413.
Lohscheller, J., Eysholdt, U., Toy, H., Dollinger, M. (2008). Phonovibrography: mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Trans. Med. Imaging 27, 300-309.
Lohscheller, J., Svec, J. G., Döllinger, M. (2012). Vocal fold vibration amplitude, open quotient, speed quotient and their variability along glottal length: kymographic data from normal subjects. Logoped. Phoniatr. Vocol. doi:10.3109/14015439.2012.731083.
McComb, K., Reby, D., Baker, L., Moss, C., Sayialel, S. (2003). Long-distance communication of acoustic cues to social identity in African elephants. Anim. Behav. 65, 317-329.
Miller, D. G., Schutte, H. K. (1984). Characteristic patterns of sub- and supraglottal pressure variations within the glottal cycle. In Transcripts of the XIIIth symposium: Care of the Professional Voice (ed. Lawrence, V.). New York, NY: The Voice Foundation.
Neubauer, J., Edgerton, M., Herzel, H. (2004). Nonlinear phenomena in contemporary vocal music. J. Voice 18, 1-12.
Payne, K. B., Langbauer, W. R., Thomas, E. M. (1986). Infrasonic calls of the Asian elephant (Elephas maximus). Behav. Ecol. Sociobiol. 18, 297-301.
Poole, J. H., Payne, K., Langbauer, W. R., Moss, C. J. (1988). The social contexts of some very low frequency calls of African elephants. Behav. Ecol. Sociobiol. 22, 385-392.
Remmers, J. E., Gautier, H. (1972). Neural and mechanical mechanisms of feline purring. Respir. Physiol. 16, 351-361.
Sakakibara, K.-I., Fuks, L., Imagawa, H., Tayama, N. (2004). Growl voice in ethnic and pop styles. In Proceedings of the International Symposium on Musical Acoustics (ed. Naganuma, D.), Nara, Japan.
Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B., et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676-682.
Schutte, H. K., Miller, D. G. (1988). Resonanzspiele der Gesangsstimme in ihren Beziehungen zu supra- und subglottalen Druckverläufen: Konsequenzen für die Stimmbildungstheorie. Folia Phoniatr. (Basel) 40, 65-73.
Serway, R. A. (1990). Wave motion. In Physics for Scientists and Engineers with Modern Physics (ed. Serway, R. A.), pp. 430-454. Philadelphia, PA: Saunders College Publishing.
Sissom, D., Rice, D., Peters, G. (1991). How cats purr. J. Zool. 223, 67-78.
Soltis, J., Leong, K., Savage, A. (2005). African elephant vocal communication II: rumble variation reflects the individual identity and emotional state of callers. Anim. Behav. 70, 589-599.
Stoeger, A. S., Heilmann, G., Zeppelzauer, M., Ganswindt, A., Hensman, S., Charlton, B. D. (2012). Visualizing sound emission of elephant vocalizations: evidence for two rumble production types. PLoS ONE 7, e48907.
Stoeger-Horwath, A. S., Stoeger, S., Schwammer, H. M., Kratochvil, H. (2007). Call repertoire of infant African elephants: first insights into the early vocal ontogeny. J. Acoust. Soc. Am. 121, 3922-3931.
Svec, J. G., Schutte, H. K. (1996). Videokymography: high-speed line scanning of vocal fold vibration. J. Voice 10, 201-205.
Svec, J. G., Schutte, H. K., Miller, D. G. (1999). On pitch jumps between chest and falsetto registers in voice: data from living and excised human larynges. J. Acoust. Soc. Am. 106, 1523-1531.
Svec, J. G., Horácek, J., Sram, F., Veselý, J. (2000). Resonance properties of the vocal folds: in vivo laryngoscopic investigation of the externally excited laryngeal vibrations. J. Acoust. Soc. Am. 108, 1397-1407.
Svec, J. G., Sram, F., Schutte, H. K. (2007). Videokymography in voice disorders: what to look for? Ann. Otol. Rhinol. Laryngol. 116, 172-180.
Titze, I. R. (1988). The physics of small-amplitude oscillation of the vocal folds. J. Acoust. Soc. Am. 83, 1536-1552.
Titze, I. R. (1994). Mechanical stress in phonation. J. Voice 8, 99-105.
Titze, I. R. (2000). Principles of Voice Production. Iowa City, IA: National Center for Voice and Speech.
Titze, I. R. (2006). The Myoelastic Aerodynamic Theory of Phonation. Denver, CO: National Center for Voice and Speech.
Titze, I. R., Baken, R. J., Herzel, H. (1993a). Evidence of chaos in vocal fold vibration. In Vocal Fold Physiology: Frontiers in Basic Science (ed. Titze, I. R.), pp. 143-188. San Diego, CA: Singular Publishing Group.
Titze, I. R., Jiang, J. J., Hsiao, T. Y. (1993b). Measurement of mucosal wave propagation and vertical phase difference in vocal fold vibration. Ann. Otol. Rhinol. Laryngol. 102, 58-63.
Titze, I. R., Fitch, W. T., Hunter, E. J., Alipour, F., Montequin, D., Armstrong, D. L., McGee, J. A., Walsh, E. J. (2010). Vocal power and pressure-flow relationships in excised tiger larynges. J. Exp. Biol. 213, 3866-3873.
Tokuda, I. T., Horácek, J., Svec, J. G., Herzel, H. (2008). Bifurcations and chaos in register transitions of excised larynx experiments. Chaos 18, 013102.
Van Den Berg, J. (1958). Myoelastic-aerodynamic theory of voice production. J. Speech Hear. Res. 1, 227-244.
von Glass, W., Pech, H.-J. (1983). Zum Ossifikationsprinzip des Kehlkopfskelets von Mensch und Säugetieren. Vergleichende anatomische Untersuchungen. [Ossification principle of the laryngeal skeleton of the human and mammals. Comparative anatomic studies]. Acta Anat. (Basel) 116, 158-167.
Wittenberg, T., Tigges, M., Mergell, P., Eysholdt, U. (2000). Functional imaging of vocal fold vibration: digital multislice high-speed kymography. J. Voice 14, 422-442.
Zemlin, W. (1988). Speech and Hearing Science. Anatomy and Physiology. Englewood Cliffs, NJ: Prentice Hall.

COMPETING INTERESTS

No competing interests declared.

Supplementary information