Sensory systems function most efficiently when processing natural stimuli, such as vocalizations, and it is thought that this reflects evolutionary adaptation. Among the best-described examples of evolutionary adaptation in the auditory system are the frequent matches between spectral tuning in both the peripheral and central auditory systems of anurans (frogs and toads) and the frequency spectra of conspecific calls. Tuning to the temporal properties of conspecific calls is less well established, and in anurans has so far been documented only in the central auditory system. Using auditory-evoked potentials, we asked whether there are species-specific or sex-specific adaptations of the auditory systems of gray treefrogs (Hyla chrysoscelis) and green treefrogs (H. cinerea) to the temporal modulations present in conspecific calls. Modulation rate transfer functions (MRTFs) constructed from auditory steady-state responses revealed that each species was more sensitive than the other to the modulation rates typical of conspecific advertisement calls. In addition, auditory brainstem responses (ABRs) to paired clicks indicated relatively better temporal resolution in green treefrogs, which could represent an adaptation to the faster modulation rates present in the calls of this species. MRTFs and recovery of ABRs to paired clicks were generally similar between the sexes, and we found no evidence that males were more sensitive than females to the temporal modulation patterns characteristic of the aggressive calls used in male–male competition. Together, our results suggest that efficient processing of the temporal properties of behaviorally relevant sounds begins at potentially very early stages of the anuran auditory system that include the periphery.
A prominent hypothesis in systems neuroscience is that sensory systems are most efficient when processing natural stimuli (Atick, 1992; Barlow, 1961; van Hateren, 1992; Laughlin, 1981; Simoncelli and Olshausen, 2001). This efficiency reduces energy and resource expenditure associated with sensory processing. Auditory systems appear well adapted to process the spectral and temporal features of natural sounds, such as speech and other communication signals (Rieke et al., 1995; Singh and Theunissen, 2003; Smith and Lewicki, 2006; Suga, 1989; Woolley et al., 2005). Often such adaptations manifest as selectivity for behaviorally relevant sounds, which helps increase detectability of signals relative to background noise (Machens et al., 2005; Rieke et al., 1995). For example, the spectro-temporal tuning of neurons in the midbrain and forebrain of songbirds facilitates discrimination between conspecific songs, while limiting interference from modulations inherent in sounds that are less behaviorally relevant (Woolley et al., 2005).
Research on anuran amphibians (frogs and toads) yielded some of the first examples of auditory adaptations to natural sounds (Capranica and Moffat, 1975; Frishkopf et al., 1968; Mudry et al., 1977; Narins and Capranica, 1976). In most anuran species, males have repertoires of calls that are used for mate attraction and resource defense. In the auditory periphery, one or both of the two inner ear sensory papillae for detecting airborne sound − the amphibian papilla (AP) and the basilar papilla (BP) − and their afferents are predominantly tuned to the acoustic frequencies that are emphasized in conspecific calls (Capranica and Moffat, 1983; Frishkopf et al., 1968; Narins and Capranica, 1980; Ryan et al., 1992). Neurons in the central auditory system are also predominantly tuned to acoustic frequencies in conspecific calls, with some combination-sensitive neurons firing only when multiple frequencies from conspecific calls are present (Fuzessery and Feng, 1982, 1983; Hall, 1994; Megela, 1983; Mudry and Capranica, 1987a,b; Mudry et al., 1977). This matched spectral filtering (Capranica and Moffat, 1983; Simmons, 2013) by both the peripheral and central nervous systems represents an evolutionary adaptation that facilitates coding of the frequency spectra of vocalizations, which are especially important natural stimuli for frogs.
In addition to spectral properties, temporal properties of anuran calls are also crucial for species and call recognition, and for intraspecific discrimination (Castellano and Rosso, 2006; Gerhardt, 1978; Gerhardt and Doherty, 1988; Rose and Brenowitz, 2002; Schwartz, 1987; Walkowiak and Brzoska, 1982). There is evidence for the operation of matched temporal filters in the central auditory system, but less so in the periphery (Rose and Gooler, 2007; Simmons, 2013). In the central auditory system, neurons exhibit preferences for specific temporal properties of calls, such as the rate of pulses or amplitude modulation (AM) (Diekamp and Gerhardt, 1995; Eggermont, 1990; Gooler and Feng, 1992; Walkowiak, 1984), inter-pulse interval (Alder and Rose, 1998; Edwards et al., 2002) and duration (Condon et al., 1991; Gooler and Feng, 1992; Narins and Capranica, 1980; Penna et al., 1997) using rate codes. In the case of AM, the distributions of AM rates preferred by neurons in the central auditory system are often centered near the pulse rates or modulation rates that are characteristic of conspecific calls, suggesting specialization for the neural encoding of the temporal patterns present in conspecific signals (Diekamp and Gerhardt, 1995; Penna et al., 2001; Rose and Capranica, 1984, 1985; Rose et al., 1985). In contrast to the rate code common in central auditory neurons, auditory nerve fibers encode temporal properties in the timing of their impulses. For example, nerve fibers use a periodicity code to encode AM by phase-locking, or discharging at a particular phase of the modulation cycle (Dunia and Narins, 1989; Feng et al., 1991; Rose and Capranica, 1985). The ability of auditory nerve fibers to phase-lock to AM tends to decrease as a function of increasing modulation rate (Dunia and Narins, 1989; Feng et al., 1991; Rose and Capranica, 1985). 
Although several studies have verified the ability of anuran auditory nerve fibers to phase-lock to temporal modulations in the amplitude envelopes of conspecific signals (Capranica and Moffat, 1975; Frishkopf et al., 1968; Klump et al., 2004; Schwartz and Simmons, 1990; Simmons et al., 1992, 1993), there is so far little evidence for enhanced peripheral selectivity favoring the temporal modulations that are typical of conspecific calls.
The broad aim of this comparative study was to investigate features of temporal processing by the peripheral auditory system that might reflect adaptations for encoding temporal modulations present in conspecific vocalizations. We conducted our experiments using two well-studied frogs, Cope's gray treefrog (Hyla chrysoscelis) and the green treefrog (H. cinerea) (Bee, 2012, 2015; Gerhardt, 1982, 2001; Gerhardt and Huber, 2002). The advertisement calls that males of each species produce differ in both spectral and temporal properties (see Fig. 1). The advertisement call of gray treefrogs comprises a series of short (e.g. 10 ms), temporally discrete pulses delivered at species-specific rates of about 40 to 65 pulses s−1 (Ward et al., 2013). Pulses have energy at frequencies of about 1.25 kHz and 2.5 kHz, with the lower frequency peak attenuated by about 11 dB relative to the higher peak (Ward et al., 2013). By contrast, the advertisement call of the green treefrog consists of a single biphasic note (120–200 ms; Gerhardt, 1974a) with an initial pulsed phase and pulse rates ranging between about 100 and 200 pulses s−1, followed by an un-pulsed phase with marked waveform periodicity of about 300 Hz (typically ranging from about 200 to 400 Hz) (Oldham and Gerhardt, 1975). These calls contain spectral peaks of approximately equivalent amplitude, with one near 0.9 kHz and a second broader peak between about 2.5 and 3.6 kHz (Gerhardt, 1974a).
In addition to advertisement calls, males of both species also use aggressive calls in disputes with other males over possession of calling sites. The aggressive calls of gray treefrogs exhibit AM in the range of 50 to 100 Hz, though they typically lack the distinct pulsatile structure of advertisement calls (M. S. Reichert, personal communication; Reichert and Gerhardt, 2014). The aggressive calls of green treefrogs are similar to their advertisement calls, but are pulsed throughout at rates near 50 pulses s−1 (ranging between 39 and 56 pulses s−1) (Oldham and Gerhardt, 1975). Female treefrogs strongly prefer advertisement calls to aggressive calls (Brenowitz and Rose, 1999; Marshall et al., 2003; Oldham and Gerhardt, 1975; Schwartz, 1986, 1987; Wells and Bard, 1987). Given their importance in male–male competition for calling sites, aggressive calls are likely more behaviorally salient to males than females.
We investigated temporal processing using auditory evoked potentials (AEPs). AEPs measure neural activity from the auditory nerve and brainstem in response to acoustic stimuli, and they are a common tool for studying auditory processing in humans and other animals (Brittan-Powell et al., 2010a,b; Gall et al., 2013; Hall, 2007; Henry and Lucas, 2008; Higgs et al., 2002; Katbamna et al., 1992; Kenyon et al., 1998; Ladich and Fay, 2013; Popov and Supin, 1990; Supin et al., 1993). We used two well-established AEP techniques that have been used previously to investigate temporal processing: the auditory steady-state response (ASSR) evoked by AM tones and the auditory brainstem response (ABR) evoked by paired acoustic clicks (Burkard and Deegan, 1984; Dolphin and Mountain, 1992; Gall et al., 2013; Henry and Lucas, 2008; Mann et al., 2005; Purcell et al., 2004; Wysocki and Ladich, 2005). The magnitude of the ASSR reflects the degree of neural synchronization to AM in the signal, and thus the ASSR measures the ability of the auditory system to track temporal fluctuations in amplitude (Dolphin and Mountain, 1992; Gall et al., 2012; Mann et al., 2005). ASSR magnitude can be plotted as a function of AM rate to generate modulation rate transfer functions (MRTFs) (Fig. 2). MRTFs typically have an overall low-pass shape consistent with phase-locking in the auditory nerve (Dolphin and Mountain, 1992; Dolphin et al., 1994, 1995; Finneran et al., 2007; Gall et al., 2012). In the present study, we recorded ASSRs in response to tones of three different carrier frequencies modulated at AM rates between 12.5 Hz and 800 Hz (in one-octave steps). For each species, the specific carrier frequencies (denoted low, middle or high, in reference to their relative frequencies) were selected based on the species-specific tuning of the AP and BP in our two study species.
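As a rough illustration of how an MRTF could be assembled, the following Python sketch lists the seven modulation rates implied by one-octave steps from 12.5 to 800 Hz and estimates ASSR magnitude as the amplitude of the response component at the modulation rate. The function names, the single-bin Fourier estimate, the 0.32 s analysis window and the synthetic low-pass amplitudes are all assumptions made for illustration; they are not the analysis pipeline or data of the present study.

```python
import cmath
import math

FS = 25_000  # Hz; assumed here to match the neural sampling rate given in the Methods
N = 8_000    # samples (0.32 s), chosen so every tested rate completes whole cycles


def octave_steps(start_hz=12.5, stop_hz=800.0):
    """Modulation rates in one-octave steps, from start_hz up to stop_hz."""
    n_octaves = round(math.log2(stop_hz / start_hz))
    return [start_hz * 2 ** k for k in range(n_octaves + 1)]


def assr_magnitude(response, fs, mod_rate_hz):
    """Amplitude of the response component at mod_rate_hz (single-bin DFT)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * mod_rate_hz * i / fs)
        for i, x in enumerate(response)
    )
    return 2.0 * abs(acc) / len(response)


def synthetic_response(mod_rate_hz, amplitude):
    """Stand-in for an averaged AEP: a pure sinusoid at the modulation rate."""
    return [amplitude * math.sin(2 * math.pi * mod_rate_hz * i / FS) for i in range(N)]


rates = octave_steps()
# Illustrative low-pass amplitudes only; these are not data from the study.
mrtf = {
    rate: assr_magnitude(synthetic_response(rate, 1.0 / (1.0 + rate / 100.0)), FS, rate)
    for rate in rates
}
for rate, magnitude in mrtf.items():
    print(f"{rate:6.1f} Hz  {magnitude:.3f}")
```

Plotting the values in `mrtf` against modulation rate would give a curve with the low-pass shape described above, because the synthetic amplitudes were chosen to fall as rate increases.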
We recorded ABRs in response to paired clicks, in which the time between the clicks (inter-click interval, ICI) varied between trials (Fig. 3). This double-click procedure measures the ability of the auditory system to resolve two sounds in close temporal proximity (Burkard and Deegan, 1984; Henry et al., 2011; Supin and Popov, 1995a; Wysocki and Ladich, 2002). We focused these analyses on the first peak of the ABR (P1; Fig. 4), which is thought to be generated by the auditory nerve (Achor and Starr, 1980; Buchwald and Huang, 1975; Seaman, 1991). Performance was measured in terms of percentage recovery, which we calculated as the amplitude (as in Fig. 4) of the response to the second click in a pair as a percentage of the amplitude of the response to a single click. Additionally, we calculated the minimum resolvable ICI at which a response to the second click could be detected.
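The two paired-click measures can be sketched in a few lines of Python. Percentage recovery follows the definition given above; the detection criterion for the minimum resolvable ICI is not specified in this section, so the fixed amplitude floor used here, like the function names and the example amplitudes, is a hypothetical stand-in.

```python
def percent_recovery(second_click_amp, single_click_amp):
    """Second-click P1 amplitude as a percentage of the single-click amplitude."""
    return 100.0 * second_click_amp / single_click_amp


def min_resolvable_ici(second_click_amps, detection_floor):
    """Shortest ICI (ms) whose second-click response exceeds the floor.

    The floor is an assumed detection criterion, not the one used in the study.
    Returns None if no response is detected at any ICI.
    """
    detected = [ici for ici, amp in second_click_amps.items() if amp > detection_floor]
    return min(detected) if detected else None


# Made-up amplitudes (arbitrary units) keyed by ICI in ms, for illustration only.
example_amps = {1.0: 0.0, 2.0: 0.3, 5.0: 0.8, 10.0: 1.6}
single_click = 2.0

recovery = {ici: percent_recovery(amp, single_click) for ici, amp in example_amps.items()}
shortest = min_resolvable_ici(example_amps, detection_floor=0.1)
print(recovery, shortest)
```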
We used data from ASSR and ABR recordings to test two hypotheses related to the temporal processing of natural sounds. The species-specific adaptation hypothesis holds that the auditory system is specialized to process temporal features characteristic of conspecific advertisement calls compared with those more typical of heterospecific calls. We based this hypothesis on species differences between advertisement calls for two reasons. First, advertisement calls are by far the most common vocalization produced by males of both species. Second, these signals are used in both mate attraction and call site defense and, thus, are behaviorally relevant to both sexes (Garton and Brandon, 1975; Ritke and Semlitsch, 1991; Wells, 1977). Our comparative approach allowed us to make the following prediction: gray treefrogs should have relatively larger ASSRs than green treefrogs at the relatively slower modulation rates (e.g. between 25 and 100 Hz) near the pulse rates of gray treefrog advertisement calls, whereas green treefrogs should have relatively larger ASSRs than gray treefrogs to stimuli with relatively faster modulation rates close to those typical of the faster modulations present in green treefrog advertisement calls (e.g. between 100 and 400 Hz). These species differences should be reflected in a species×modulation rate interaction in analyses of MRTFs. We also predicted that, in ABRs evoked by paired clicks, green treefrogs would show faster recovery of responses to the second click and shorter minimum resolvable ICIs than gray treefrogs, because tracking the faster modulation rates in the green treefrog advertisement call should require greater temporal resolution.
The sex-specific adaptation hypothesis holds that males should exhibit greater selectivity than females for the temporal features of conspecific aggressive calls. This hypothesis follows from the inference that aggressive calls, which are used in male–male interactions, are more behaviorally salient to males than females. According to this hypothesis, we predicted that a species×sex×modulation rate interaction would influence the shape of MRTFs. In gray treefrogs, the temporal modulations present in aggressive calls (50–100 pulses s−1) are slightly faster than those in advertisement calls (40–65 pulses s−1); therefore, we predicted MRTFs for male gray treefrogs would be skewed toward faster modulation rates than those of conspecific females (i.e. relatively larger ASSRs between 50 and 100 Hz in males). In response to paired clicks, we also predicted that male gray treefrogs, compared with conspecific females, would have faster ABR recovery and shorter minimum resolvable ICIs. In green treefrogs, aggressive calls exhibit temporal modulations at rates between 39 and 56 pulses s−1; therefore, we predicted male green treefrogs would have greater ASSRs than conspecific females at modulation rates near 50 Hz. Given that the modulations in advertisement calls of green treefrogs are actually faster than those in male–male aggressive calls, we predicted either no sex difference or relatively faster ABR recovery and shorter minimum resolvable ICIs in females than males in this species.
Species-specific adaptation hypothesis
As in many other animals, the MRTFs for both treefrog species decreased as modulation rate increased (Fig. 5). The effect of modulation rate was significant, and it also had a large effect size compared with the other effects (Table 1). There was no significant main effect of species; however, the species×modulation rate interaction was significant (Table 1). The effects of this interaction can be seen in that each species had larger ASSRs than the other at modulation rates typical of conspecific calls, a result consistent with our predictions. For example, at modulation rates of 25–100 Hz, gray treefrogs had significantly larger responses than green treefrogs when stimuli had the highest carrier frequency (Table 2; Fig. 5). By contrast, green treefrogs had larger ASSRs than gray treefrogs at higher modulation rates (e.g. 200 and 400 Hz) for most carrier frequencies (Fig. 5). The difference was significant for responses to modulation rates of 200 Hz at all carrier frequencies (Table 2).
Recovery increased as a function of increasing ICI (Table 3), and these functions were overall very similar in shape between the two species (Fig. 6A). There was no significant effect of species on recovery, nor were there significant effects of any of the interactions involving species (Table 3). This result was inconsistent with our predictions. On average, however, green treefrogs were able to resolve slightly shorter ICIs than gray treefrogs (F1,61=5.7, P=0.020, partial η2=0.09), a result that was consistent with our prediction. The average minimum resolvable ICI was (mean±s.e.m.) 1.6±0.1 ms for green treefrogs and 2.0±0.1 ms for gray treefrogs.
Sex-specific adaptation hypothesis
Overall, MRTFs were similar between the sexes in both gray treefrogs (Fig. 7A) and green treefrogs (Fig. 7B). In contrast to our predictions, the species×sex×modulation rate interaction was not significant (Table 1). Hence, there was no evidence of larger responses in male gray treefrogs than female gray treefrogs at modulation rates of 50 and 100 Hz, nor did male green treefrogs have larger responses than conspecific females at modulation rates of 50 Hz. There was, however, a significant sex×modulation rate×carrier frequency interaction (Table 1). In response to the middle carrier frequency, females of both species consistently had larger ASSRs than males, a difference that reached significance in response to stimuli with modulation rates between 50 and 400 Hz (Table 2; Fig. 7). Responses to stimuli at the middle carrier frequency overall tended to be larger for females and smaller for males than corresponding responses to stimuli with the low or high carrier frequency.
Inconsistent with our predictions, recovery functions differed little between the two sexes (Fig. 6B). Subject sex did not have a significant effect on percentage recovery, and the interaction of sex with ICI was also not significant (Table 3). There was no sex difference in minimum resolvable ICI (F1,61=0.5, P=0.469, partial η2=0.01), nor was there an interaction between species and sex (F1,61=0.2, P=0.666, partial η2<0.01).
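For readers who want to check the effect sizes reported above for minimum resolvable ICI, partial η2 can be recovered from an F statistic and its degrees of freedom through the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies that identity to the three reported tests; the function name is ours, and the sketch is a consistency check and not a reanalysis of the data.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text (F1,61).
reported = {
    "species": 5.7,
    "sex": 0.5,
    "species x sex": 0.2,
}
for effect, f_value in reported.items():
    print(f"{effect}: partial eta squared = {partial_eta_squared(f_value, 1, 61):.3f}")
```

The computed values (about 0.085, 0.008 and 0.003) agree with the reported 0.09, 0.01 and <0.01 after rounding.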
Our results provide robust support for the species-specific adaptation hypothesis, and no support for the sex-specific adaptation hypothesis. The key to uncovering evidence supporting the species-specific adaptation hypothesis was our comparisons of two species that have calls with quite different temporal structures (see Fig. 1). Cope's gray treefrogs have pulsatile advertisement calls, with pulse rates ranging between 40 and 65 pulses s−1, whereas the advertisement calls of green treefrogs exhibit temporal modulation at higher rates between 100 pulses s−1 and 400 cycles s−1. Although the aggressive calls of gray treefrogs are modulated at faster rates than their advertisement calls, these modulations are slower than the fastest rates in the advertisement and aggressive calls of green treefrogs. At low modulation rates (e.g. near 50 Hz), gray treefrogs tended to have relatively larger ASSRs than green treefrogs, especially in response to stimuli with the highest carrier frequency. In contrast to gray treefrogs, green treefrogs tended to have relatively larger ASSRs at higher modulation rates (e.g. 200 Hz), a result consistent across carrier frequencies. Green treefrogs also had relatively shorter minimum resolvable ICIs compared with gray treefrogs. Together, these results based on ASSRs and ABRs suggest that gray treefrogs are adapted to process the relatively slower modulation rates found in their calls, whereas green treefrogs are adapted to tracking the relatively faster pulse rates and periodicities in their calls.
Species differences in ASSR amplitudes depended on carrier frequency in gray treefrogs, but not in green treefrogs, as indicated by the species×modulation rate×carrier frequency interaction. This finding is noteworthy, as it suggests that specializations in temporal processing are related to differences in how the spectral properties of the vocalizations of the two species are transduced. Recall that in both species, each of the two spectral peaks in the advertisement call is primarily transduced by one of the two inner ear sensory papillae (the AP or the BP) (Buerkle et al., 2014; Capranica and Moffat, 1983; Gerhardt, 1974c; Hillery, 1984; Schrode et al., 2014). In gray treefrog advertisement calls, the relative amplitude of the high spectral peak (2.5 kHz) is approximately 11 dB greater than the low spectral peak (1.25 kHz; Ward et al., 2013). Hence, the majority of the acoustic energy in gray treefrog calls falls in the frequency range of the BP. This is relevant because gray treefrogs were better than green treefrogs at processing the slower modulation rates typical of gray treefrog advertisement calls only at the high carrier frequencies transduced primarily by the BP. In contrast to gray treefrogs, the low and high spectral peaks of green treefrog calls have comparable relative amplitudes; therefore, both the AP and the BP transduce prominent spectral peaks in green treefrog calls. Green treefrogs were relatively better than gray treefrogs at processing the faster modulation rates typical of green treefrog advertisement calls at both the low and the high carrier frequencies transduced primarily by the AP and BP, respectively. Our data, therefore, suggest that species-specific adaptations in temporal processing may be closely tied to potential species differences in the roles of the two sensory papillae in processing spectral information in conspecific vocalizations. 
At present, the specific mechanism underlying the species×modulation rate×carrier frequency interaction remains unknown and should be investigated further in future studies.
The species-specific adaptations identified in the present study are consistent with the idea that sensory systems are specially adapted to process the temporal patterns of common or behaviorally important natural stimuli. Adaptation of sensory systems to salient stimuli can improve efficiency and accuracy of neural processing (Atick, 1992; Barlow, 1961; van Hateren, 1992; Laughlin, 1981; Simoncelli and Olshausen, 2001). In the case of treefrogs communicating acoustically in cacophonous breeding choruses, adaptation of the auditory systems to process the temporal patterns present in conspecific calls could improve the neural encoding of these signals, while reducing masking and acoustic interference by the calls of syntopically breeding heterospecifics. These improvements could facilitate detection of and discrimination between conspecific calls, impacting both mate choice decisions by females and disputes between males over calling sites.
Species-specific adaptations of temporal processing in the auditory system have been recently identified in songbirds, another group of vocal animals. The responses of neurons in the auditory midbrains of zebra finches (Taeniopygia guttata) were found to synchronize to the temporal envelopes of SAM noise across a range of modulation rates that closely match the modulations in conspecific songs (Woolley and Casseday, 2005). The tuning of cells in the midbrain to the temporal properties of sounds is also context dependent. Temporal tuning tended to be sharper in response to conspecific song stimuli compared with behaviorally neutral noise stimuli, even when the modulations contained in the stimuli were similar (Woolley et al., 2006). Woolley et al. (2005) identified a mismatch between the best temporal tuning of neurons in both the auditory midbrain and forebrain areas, and the typical modulation rates in conspecific songs. When considering the average response of neurons in the auditory midbrain and also forebrain areas, the strength of temporal tuning increased as a function of modulation rate, whereas the power of the modulations occurring in conspecific song decreased as a function of modulation rate (Woolley et al., 2005). This pattern of tuning has the effect of attenuating common modulations and amplifying modulations that vary between conspecific songs, potentially increasing the discriminability of songs (Woolley et al., 2005).
An important implication of the results of the present study is that species-specific adaptations in temporal processing might occur as early as the auditory periphery in frogs. In both gray treefrogs and green treefrogs, MRTFs based on ASSR magnitudes were nearly log-linear with respect to modulation rate, with responses decreasing as a function of increasing modulation rate. The ASSR is a measure of neural synchronization, with a strong component originating in auditory nerve fibers (Henry and Lucas, 2008; Supin and Popov, 1995b). The shapes of MRTFs in this study are consistent with previous studies of auditory nerve fibers in frogs, which have also reported decreasing neural synchronization as a function of increasing modulation rate (Dunia and Narins, 1989; Feng et al., 1991; Rose and Capranica, 1985). Our data on ABRs in response to double-click stimuli also support the idea that species differences in temporal processing might arise as early as the auditory periphery. The primary generator of the first peak (P1) in the ABRs of all animals studied to date is the auditory nerve (Achor and Starr, 1980; Brown-Borg et al., 1987; Buchwald and Huang, 1975; Lev and Sohmer, 1972; Seaman, 1991). The timing of P1 of the ABR in both gray and green treefrogs corresponds well to the expected latencies of anuran auditory nerve fibers (Buerkle et al., 2014; Schrode et al., 2014). In support of this view, the minimum resolvable ICIs of between 1.5 and 2.0 ms measured in the present study are comparable with the average gap detection times of between 1.2 and 2.2 ms reported previously for anuran auditory nerve fibers (Feng et al., 1994). Our results, therefore, support the hypothesis that the well-known adaptations of the frog peripheral auditory system for processing natural sounds in the spectral domain may also extend to processing in the temporal domain.
It is potentially surprising that species-specific adaptations in temporal processing might arise in the periphery, because the peripheral auditory system is generally considered to function as a low-pass envelope filter (Carney, 1993; Dau et al., 1996; Dolphin et al., 1995; Frisina, 2001). However, adaptations of the peripheral auditory system for particular modulation rates have been identified in at least one previous study. In a comparison of three species of songbirds, Henry and Lucas (2008) found that the two species whose vocalizations contained the fastest modulations also exhibited larger ASSRs than the third species, particularly at rates faster than 950 Hz. These results suggested co-evolution of temporal resolution and temporal modulations in conspecific vocalizations. We believe the current study presents the first evidence in favor of species-specific adaptations for processing temporal patterns in conspecific signals at the level of the auditory periphery in anurans. Future comparative work using electrophysiological recordings from single auditory nerve fibers in both gray treefrogs and green treefrogs will be required to confirm this hypothesis.
Our demonstration of species differences in temporal processing also sheds important light on potential auditory mechanisms related to the so-called ‘cocktail party problem’ (McDermott, 2009). In both humans and frogs, the background noise levels characteristic of large social aggregations fluctuate in amplitude. Human listeners can take advantage of brief ‘dips’ in noise levels to catch acoustic glimpses of target signals of interest (Bacon et al., 1998; Cooke, 2006; Füllgrabe et al., 2006; Vestergaard et al., 2011). This ability is known as ‘dip listening’, and it is thought to be dependent on having auditory temporal resolution sufficient to resolve the fluctuations in the background noise (Festen, 1993; Qin and Oxenham, 2003). Recent comparative psychophysical studies of gray and green treefrogs uncovered a species difference in their abilities to recognize conspecific advertisement calls in the presence of temporally fluctuating noise (Vélez and Bee, 2010, 2011, 2013; Vélez et al., 2012). Gray treefrogs, but not green treefrogs, were able to listen in dips to achieve a release from auditory masking by chorus-like noises that fluctuated in amplitude over time. Based on this behavioral difference between the two species, we would have expected gray treefrogs to have better temporal resolution than green treefrogs. However, our results do not support this conclusion and possibly suggest precisely the opposite pattern. A relatively larger ASSR indicates greater synchrony of neural responses, an important component of temporal resolution. Across the modulation rates tested, for each species there were instances when it had larger ASSRs than the other species. However, several factors suggest that green treefrogs have better temporal resolution than gray treefrogs.
Green treefrogs had larger ASSRs than gray treefrogs in response to far more stimulus conditions, and furthermore, they tended to have larger ASSRs at faster modulation rates, indicating an ability to synchronize to and to resolve faster modulation rates. Green treefrogs also had shorter minimum resolvable ICIs based on ABRs than gray treefrogs. Thus, the species differences in temporal processing that might exist at the level of the auditory periphery reported in the present study appear poorly suited to explain the differences in dip listening abilities previously described for these two species. The present study, therefore, highlights the potentially important role of central auditory processes in solving cocktail-party-like communication problems.
A final important result from this study is that it failed to uncover evidence for sex differences in temporal processing. We saw no evidence that males had relatively larger ASSRs than females at modulation rates typical of conspecific aggressive calls, nor was there evidence for a sex difference in percentage recovery functions based on ABRs. Instead, we observed frequency-dependent sex differences in which females tended to have relatively larger ASSRs than males at the middle carrier frequency. Previous behavioral studies (Gerhardt, 2005) and recordings of AEPs (Buerkle et al., 2014; Schrode et al., 2014) in treefrogs indicate that sound frequencies between the two spectral peaks of advertisement calls, and correspondingly between the two frequency regions of greatest auditory sensitivity, are able to stimulate simultaneously both auditory papillae in the anuran inner ear. The observation of larger ASSRs in females suggests better recruitment of nerve fibers across the two papillae in females than in males. This result is consistent with previous results from recordings of AEPs in these species (Buerkle et al., 2014; Schrode et al., 2014). In those studies, the amplitudes of P1 of tone-evoked ABRs were larger in females than males when tones had intermediate frequencies (1.5 to 2.0 kHz). At present, it remains unclear whether this frequency-dependent sex difference in responses is indicative of an evolutionary adaptation related to some aspect of spectral or temporal processing.
MATERIALS AND METHODS
Subjects were 68 gray treefrogs (35 female) and 59 green treefrogs (30 female). Gray treefrogs were collected from Carver Park Reserve (Carver County, MN, USA), Crow-Hassan Park Reserve (Hennepin County, MN, USA) or Lake Maria State Park (Wright County, MN, USA). Green treefrogs were collected from the East Texas Conservation Center (Jasper County, TX, USA). All frogs were collected in amplexus during their respective breeding seasons in either 2011 or 2012. Female gray treefrogs (mean±s.d.: mass=5.2±1.0 g; SVL=39.3±2.7 mm) tended to be larger than male gray treefrogs (4.2±0.8 g; 35.8±1.9 mm). In green treefrogs, females (7.4±1.5 g; 49.4±3.0 mm) and males (7.2±1.4 g; 48.0±3.2 mm) were similar in size. After collection, frogs were transported to the laboratory, where they were housed in terraria on a 12 h:12 h light:dark cycle at ambient room temperature (20±2°C). We supplied frogs with fresh water and a regular diet of vitamin-dusted crickets. We tested each subject within 3 weeks of collection. All animals were collected with permission from the Minnesota Department of Natural Resources (permits 17892 and 19061) and Texas Parks and Wildlife (permit SPR-0410-054), and treated according to protocols approved by the Institutional Animal Care and Use Committee of the University of Minnesota (1103A97192).
Equipment and procedures for recording AEPs have been described previously (Buerkle et al., 2014; Schrode et al., 2014). Briefly, we generated all digital stimuli (50 kHz sampling rate, 16-bit) in TDT SigGenRP software (Tucker Davis Technologies, Alachua, FL, USA). TDT BioSigRP software coordinated stimulus output and neural recording through TDT System 3 hardware. Stimuli were broadcast through an Orb Mod 1 speaker (Orb Audio, New York, NY, USA), which was driven by a Crown XLS 202 amplifier (Crown Audio, Elkhart, IN, USA).
Recordings were made inside a MAC-3 radio-shielded mini-acoustical chamber (W×D×H: 81.3×61×61 cm; Industrial Acoustics Company, Bronx, NY, USA). For recordings, we first immobilized subjects with an intra-muscular injection of d-tubocurarine chloride (3–12 µg g−1 body weight). Subjects were loosely wrapped in a thin piece of moistened gauze to facilitate cutaneous respiration and seated in a natural position on an acoustically transparent platform, facing the speaker. Temperature was monitored via a Miller & Weber quick-reading thermometer placed against the subject's body wall and ranged between 18 and 20°C across recording sessions. We have observed gray treefrogs in amplexus at temperatures between 14 and 23°C, and green treefrogs in amplexus at temperatures between 17 and 26°C. We placed subjects so that the rostral edges of their tympana were 30 cm from the face of the speaker. We applied a topical anesthetic (2.5% lidocaine HCl) to the scalp of the subject prior to inserting the tips of three subcutaneous electrodes (1–5 kΩ) under the skin. The recording electrode was located between the eyes and the ground and inverting electrodes were placed adjacent to the two tympana. Neural signals were sampled at a rate of 25 kHz, digitized and amplified before being transmitted via optic fiber cable to a TDT RZ5 processor and stored for offline analysis. On the rare occasion that a recording was contaminated with an obvious artifact (e.g. due to infrequent buccal pumping motion), that recording was repeated.
Auditory steady-state responses (ASSRs) to modulated tones
We generated AM tones by multiplying two sinusoids, one serving as the modulator (100% modulation depth) and the second serving as the carrier signal. Tones were modulated in one-octave steps at rates of 12.5, 25, 50, 100, 200, 400 and 800 Hz, and were of a sufficient duration to ensure that subjects heard at least 10 modulation cycles at each modulation rate. Tones with modulation rates of 12.5 Hz had a duration of 800 ms. All other tones had durations of 400 ms. We used three different carrier frequencies for each species (1.25, 1.625 and 2.5 kHz for gray treefrogs; 0.9, 1.6 and 2.7 kHz for green treefrogs). The low and high carrier frequencies for each species corresponded to frequencies prominent in conspecific advertisement calls (Gerhardt, 1974a,b; Schrode et al., 2012), and both species tend to be most sensitive to these two frequencies (Buerkle et al., 2014; Hillery, 1984; Lombard and Straughan, 1974; Miranda and Wilczynski, 2009; Penna et al., 1992; Schrode et al., 2014). The middle carrier frequency for each species was chosen because it simultaneously excites the AP and BP at high signal levels (Buerkle et al., 2014; Gerhardt, 2005; Schrode et al., 2014). In Fig. 2 we show six cycles of example stimuli used to elicit ASSRs from green treefrogs.
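As an illustration only, the stimulus construction described above can be sketched in Python. The sampling rate, modulation rates and tone durations come from the text. The function names and the raised-cosine form of the modulator (offset so that the envelope stays between 0 and 1 at 100% depth) are assumptions made for this sketch and are not taken from the SigGenRP implementation.

```python
import math

FS = 50_000  # stimulus sampling rate in Hz, as stated in the text


def modulation_rates(start_hz=12.5, n_steps=7):
    """Modulation rates in one-octave steps: 12.5, 25, ..., 800 Hz."""
    return [start_hz * 2 ** k for k in range(n_steps)]


def tone_duration_s(mod_hz):
    """800 ms for the 12.5 Hz rate, 400 ms for all other rates."""
    return 0.8 if mod_hz == 12.5 else 0.4


def am_tone(carrier_hz, mod_hz, depth=1.0, fs=FS):
    """Multiply a sinusoidal carrier by a sinusoidal modulator.

    Assumption: the modulator is a raised cosine scaled to peak at 1,
    so depth=1.0 gives an envelope that swings from 0 to 1.
    """
    n = round(tone_duration_s(mod_hz) * fs)
    samples = []
    for i in range(n):
        t = i / fs
        envelope = (1.0 - depth * math.cos(2 * math.pi * mod_hz * t)) / (1.0 + depth)
        samples.append(envelope * math.sin(2 * math.pi * carrier_hz * t))
    return samples


rates = modulation_rates()
# Number of modulation cycles contained in each tone (rate x duration).
cycles = {rate: rate * tone_duration_s(rate) for rate in rates}
tone = am_tone(carrier_hz=1250.0, mod_hz=100.0)
```

With these durations, the 12.5 Hz and 25 Hz tones each contain 10 modulation cycles and all faster rates contain more, which matches the stated minimum of 10 cycles.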
Calibration of signal level was a two-step process. We first calibrated 1 s (unmodulated) tones with frequencies matching the carrier frequencies of the AM tones to 70 dB SPL (re. 20 µPa, C-weighted, fast RMS), using the microphone of a Larson Davis System 824 sound level meter (Larson Davis, Depew, NY, USA) placed at the approximate location of the frog's head and facing the speaker. We then matched the peak-to-peak amplitudes of each AM tone to that of the calibrated, unmodulated tone of corresponding frequency. The frequency response of the speaker was flat (±1 dB) across the range of frequencies tested.
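The second calibration step amounts to a single gain factor, which can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes the reference tone has already been set to the target level with the sound level meter, a step that code cannot reproduce.

```python
def peak_to_peak(samples):
    """Peak-to-peak amplitude of a waveform."""
    return max(samples) - min(samples)


def match_peak_to_peak(signal, reference):
    """Scale `signal` so its peak-to-peak amplitude equals that of `reference`."""
    gain = peak_to_peak(reference) / peak_to_peak(signal)
    return [gain * x for x in signal]


# Toy waveforms standing in for an AM tone and its calibrated, unmodulated tone.
am_example = [0.0, 0.5, -0.5, 0.25]
reference_example = [0.0, 2.0, -2.0, 0.0]
scaled = match_peak_to_peak(am_example, reference_example)
```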
We recorded two ASSRs to each stimulus from 30 gray treefrogs (15 females) and 30 green treefrogs (15 females); examples from a green treefrog are shown in Fig. 2. Each ASSR consisted of the average of the responses to 400 presentations of the stimulus. For each subject, we randomized the order of carrier frequencies and modulation rates of 25–800 Hz. Because of their long duration, tones modulated at a rate of 12.5 Hz were presented in a separate block prior to or following tones modulated at other rates. The timing of the 12.5 Hz block (either before or after the other recordings) and the carrier frequency of tones within the block were randomized for each subject. Recordings of responses to stimuli were notch filtered at 60 Hz (roll off: 18 dB per octave) to reduce electrical noise and low-pass filtered at 3 kHz (roll off: 6 dB per octave). The notch filter was wide enough to attenuate the amplitudes of recorded responses to stimuli with modulation rates of 50 Hz, but it did so equally for both species and both sexes.
In our statistical tests of the species-specific and sex-specific adaptation hypotheses, we considered evoked responses to occur only when the peak of the frequency spectrum of the response at the modulation rate of the stimulus was significantly higher than background noise. We accomplished this as follows. First, we determined the frequency spectrum of each ASSR (Fig. 2) by averaging over the two replicate responses to a given stimulus and then performing an FFT analysis (8192 point) over the first 400 ms of the response. The duration of this analysis window was chosen to achieve a frequency resolution suitable for the modulation rates tested, and it ensured inclusion of a whole number of cycles of the modulation stimulus, which is important for avoiding errors in the calculated frequency spectrum (Herdman and Stapells, 2001; John and Picton, 2000; Nachtigall et al., 2007; Supin and Popov, 1995b). Next, we computed an F ratio comparing the power at the modulation rate of the stimulus (e.g. 100 Hz in Fig. 2) with the average power in the 16 FFT bins adjacent to the modulation rate of the stimulus (Cone-Wesson et al., 2002; Dobie and Wilson, 1996; Gorga et al., 2004; Hall, 2007; Herdman and Stapells, 2001; Korczak et al., 2012; Picton et al., 2003; Purcell et al., 2004; Valdes et al., 1997; van der Reijden et al., 2005). Bins were approximately 3 Hz in width, so the background noise was estimated for a range of about 48 Hz surrounding the modulation rate of the stimulus. An evoked ASSR was considered to have occurred if the corresponding F ratio exceeded the critical value of F(2,32) at α=0.05 (where the degrees of freedom in the numerator and denominator are twice the number of frequency bins used to estimate the signal and noise magnitudes, respectively). We repeated this analysis using time windows that included a fixed number of cycles of each amplitude modulation (10 cycles), rather than fixed time windows (in ms). The same pattern of results noted in the text was present in the results following this alternative analysis method, so we present only the results of the analysis with a fixed time window.
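The detection criterion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study. The function names are hypothetical, a direct DFT at the bins of interest stands in for the 8192-point FFT, and the demo signal is synthetic (a 100 Hz sinusoid in Gaussian noise, sampled at 25 kHz for 400 ms, which gives 2.5 Hz bins rather than the approximately 3 Hz bins reported). The critical value uses the closed form of the F distribution that holds only when the numerator has 2 degrees of freedom.

```python
import cmath
import math
import random


def bin_power(signal, k):
    """Power in DFT bin k, computed directly without an FFT library."""
    n = len(signal)
    step = -2j * math.pi * k / n
    return abs(sum(x * cmath.exp(step * i) for i, x in enumerate(signal))) ** 2


def assr_f_ratio(signal, fs, mod_rate_hz, n_noise_bins=16):
    """Ratio of power at the modulation rate to mean power in adjacent bins."""
    n = len(signal)
    k_sig = round(mod_rate_hz * n / fs)  # bin closest to the modulation rate
    half = n_noise_bins // 2
    # Eight bins on each side of the signal bin, excluding the signal bin itself.
    noise_bins = [k_sig + d for d in range(-half, half + 1) if d != 0]
    noise_power = sum(bin_power(signal, k) for k in noise_bins) / n_noise_bins
    return bin_power(signal, k_sig) / noise_power


def f_critical_num_df_2(den_df, alpha=0.05):
    """Critical value of F(2, den_df); closed form valid only for numerator df = 2."""
    return (den_df / 2.0) * (alpha ** (-2.0 / den_df) - 1.0)


# Synthetic demo (assumed parameters): 100 Hz component plus unit-variance noise.
rng = random.Random(0)
fs, n = 25_000, 10_000
demo = [math.sin(2 * math.pi * 100 * i / fs) + rng.gauss(0.0, 1.0) for i in range(n)]

f_ratio = assr_f_ratio(demo, fs, 100.0)
critical = f_critical_num_df_2(2 * 16)  # denominator df = twice the 16 noise bins
evoked = f_ratio > critical
```

In this sketch the window holds a whole number of modulation cycles, so the 100 Hz component falls exactly on one bin. Choosing a window that does not satisfy this would spread the signal power into the neighboring bins used for the noise estimate.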
We investigated the effects of species, sex, modulation rate and carrier frequency on evoked ASSRs using a linear mixed model in R (R Development Core Team, 2014), which we fitted using the lme4 (Bates et al., 2014) and afex (Singmann, 2014) packages. Our model included species, sex, modulation rate and carrier frequency as fixed factors, all two-way interactions, and the three-way interactions of modulation rate×carrier frequency with both species and sex. We performed Tukey post hoc contrasts using the lsmeans package (Lenth, 2014) to compare between levels of different factors in the model. A significance criterion of α=0.05 was used for all analyses.
Auditory brainstem responses (ABRs) to paired clicks
Click stimuli (0.1 ms duration) output through our setup had a broadband spectrum, with a center frequency of approximately 1.6 kHz and 6 dB down points of approximately 0.345 and 2.8 kHz. Paired clicks consisted of two acoustic clicks, separated by a specified ICI. Examples are illustrated in Fig. 3A. We tested ICIs of 0.25 ms, 0.5 ms, 0.75 ms and 1–10 ms in 1 ms steps, with presentation order randomized between subjects. Each presentation of a paired-click stimulus was followed by at least 40 ms of silence and then a single-click stimulus. We recorded two replicate ABRs to the paired-click and single-click stimuli, with each replicate consisting of the average response to 1200 presentations of the stimulus. There was a silent interval of at least 40 ms between the single click and the onset of the next stimulus presentation. Click polarity was constant for all three clicks within a presentation, but alternated between presentations to reduce the microphonic potential. Clicks were calibrated to 80 dB by matching the peak-to-peak amplitude of each click to that of a calibrated 1 s tone with a frequency of 1000 Hz.
We recorded ABRs to paired clicks from 38 gray treefrogs (20 females) and 29 green treefrogs (15 females) (see Fig. 3B). At relatively long ICIs (e.g. 8 ms), separate ABRs to each of the clicks in the paired-click stimuli were usually evident (Fig. 3B,C). However, at shorter ICIs, the ABRs evoked by the first and second clicks overlapped in time. To disambiguate these overlapping ABRs, we derived the response to the second click by aligning the responses to the single and paired clicks at stimulus onset, and then subtracting, point-by-point, the first 25 ms of the average response to the single click from the average response to the paired click. This subtraction effectively removed any ABR evoked by the first click of the pair, leaving only the residual ABR evoked by the second click (Fig. 3C). Using a custom-written, cursor-based program in Matlab, we measured the amplitude of all residual ABRs and ABRs evoked by single clicks as the peak-to-peak amplitude from the top of the first peak (P1) to the bottom of the subsequent trough (see Fig. 4) (Buerkle et al., 2014; Schrode et al., 2014). If a peak was not visible, we considered the amplitude to be 0 µV. These values were used to calculate the percentage recovery as the ratio of the amplitude of a residual ABR to the amplitude of the ABR evoked by the corresponding single click. For each subject, we also measured the shortest resolvable ICI. After plotting residual ABRs as a function of ICI (as in Fig. 3C), we selected the minimum resolvable ICI as the shortest ICI for which an evoked residual ABR was visually detectable. We used a repeated-measures ANOVA to investigate the effects of species and sex on percentage recovery. We tested for significant differences in minimum resolvable ICI using a two-way ANOVA. Species and subject sex were included as fixed factors. We used a significance criterion of α=0.05 for both analyses and report P values corrected based on the Greenhouse–Geisser method (Greenhouse and Geisser, 1959) where applicable.
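The point-by-point subtraction and the percentage recovery calculation can be sketched in Python as follows. This is an illustration with hypothetical function names and toy waveforms. The automatic peak-to-trough measure (largest peak to the lowest point after it) is a simplifying assumption standing in for the manual, cursor-based measurement of P1 described above.

```python
def residual_abr(paired, single, fs=25_000, window_ms=25.0):
    """Subtract the single-click response from the onset-aligned paired-click response.

    Only the first `window_ms` of the single-click response is subtracted.
    """
    n = min(round(window_ms * fs / 1000.0), len(single), len(paired))
    out = list(paired)
    for i in range(n):
        out[i] -= single[i]
    return out


def peak_to_trough(waveform):
    """Amplitude from the largest peak to the lowest point that follows it.

    Assumption: a crude automatic stand-in for cursor-based P1 measurement.
    """
    i_peak = max(range(len(waveform)), key=waveform.__getitem__)
    return waveform[i_peak] - min(waveform[i_peak:])


def percent_recovery(residual_amplitude, single_amplitude):
    """Residual ABR amplitude as a percentage of the single-click ABR amplitude."""
    return 100.0 * residual_amplitude / single_amplitude


# Toy waveforms: the second click evokes a half-size copy of the first response.
single_example = [0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
paired_example = [0.0, 1.0, -1.0, 0.0, 0.5, -0.5, 0.0, 0.0]

residual = residual_abr(paired_example, single_example, fs=1000, window_ms=8.0)
recovery = percent_recovery(peak_to_trough(residual), peak_to_trough(single_example))
```

In the toy example the residual response is half the size of the single-click response, so the percentage recovery is 50%.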
We thank Madeleine Linck, Don Pereira, Ed Quinn, Gary Calkins and Christopher Maldonado for access to study sites and collection permissions. We also thank many undergraduate students for help collecting frogs and Dylan Verden and Desiree Schaefers for technical assistance.
K.M.S. and M.A.B. conceived and designed the experiments, K.M.S. performed the experiments, and K.M.S. and M.A.B. wrote the paper.
This work was supported by grants from the National Science Foundation [IOS 0842759 to M.A.B.] and the National Institutes of Health [T32GM008471 to T.J.E.]. Deposited in PMC for release after 12 months.
The authors declare no competing or financial interests.