ABSTRACT
Functional biologists employ numerical differentiation for many purposes, including (1) estimation of maximum velocities and accelerations as measures of behavioral performance, (2) estimation of velocity and acceleration histories for biomechanical modeling, and (3) estimation of curvature, either of a structure during movement or of the path of movement itself. I used a computer simulation experiment to explore the efficacy of ten numerical differentiation algorithms to reconstruct velocities and accelerations accurately from displacement data. These algorithms include the quadratic moving regression (MR), two variants of an automated Butterworth filter (BF1–2), four variants of a method based on the signal’s power spectrum (PSA1–4), an approximation to the Wiener filter due to Kosarev and Pantos (KPF), and both a generalized cross-validatory (GCV) and predicted mean square error (MSE) quintic spline. The displacement data simulated the highly aperiodic escape responses of a rainbow trout Oncorhynchus mykiss and a Northern pike Esox lucius (published previously). I simulated the effects of video speed (60, 125, 250, 500 Hz) and magnification (0.25, 0.5, 1 and 2 screen widths per body length) on algorithmic performance. Four performance measures were compared: the per cent error of the estimated maximum velocity (Vmax) and acceleration (Amax) and the per cent root mean square error over the middle 80% of the velocity (VRMSE) and acceleration (ARMSE) profiles.
The results present a much more optimistic role for numerical differentiation than suggested previously. Overall, the two quintic spline algorithms performed best, although the rank order of the methods varied with video speed and magnification. The MSE quintic spline was extremely stable across the entire parameter space and can be generally recommended. When the MSE spline was outperformed by another algorithm, both the difference between the estimates and the errors from true values were very small. At high video speeds and low video magnification, the GCV quintic spline proved unstable. KPF and PSA2–4 performed well only at high video speeds. MR and BF1–2 methods, popular in animal locomotion studies, performed well when estimating velocities but poorly when estimating accelerations. Finally, the high variance of the estimates for some methods should be considered when choosing an algorithm.
INTRODUCTION
Functional biology has played an increasingly important role in elucidating and understanding ecological differences among organisms (Webb, 1982, 1984; Denny, 1988; Norberg, 1990; Niklas, 1992; Wainwright and Reilly, 1994; Vogel, 1994). Many ecologically relevant functional analyses require the estimation of first and second derivatives using numerical differentiation. For example, some measures of functional performance (e.g. maximum velocity, acceleration and turning curvature) for behaviors important to fitness (e.g. feeding strikes, escape responses, maneuvering) are estimated using first and second derivatives of a discretely sampled function. Additionally, accurate estimates of the entire history of velocity, acceleration and turning curvature, over some discrete amount of time, are critical for investigating biomechanical models of animal movements that occur during feeding and locomotion. Robust biomechanical models of animal motion can lead to predictions of performance and, ultimately, ecological variation.
Numerical differentiation necessarily suffers from the magnification of small errors introduced during the digitizing process. Smoothing and filtering are two distinct, but related, classes of methods for estimating and removing this error. These two classes reflect different ‘ways of seeing’ measurement error. In smoothing, the estimated error is removed by fitting a function that follows the central trend of the data and replacing the observed values with the expected or ‘smoothed’ values. This method is probably most intuitive to biologists who are familiar with errors around trend lines in many other biological problems. In filtering, the trend, or signal, is modeled as a high-amplitude, low-frequency component of the data, and the error, or noise, is removed by filtering out the low-amplitude, high-frequency components from the signal. Filtering is probably most intuitive to engineers who are familiar with noisy signals in many different systems.
Numerous smoothing and filtering algorithms are available, but few objective reasons have been offered to choose one algorithm over another. This lack of guidance is unfortunate because, while most of these algorithms produce similar first and second derivatives for data with trivial levels of digitizing error, they produce widely divergent derivative estimates for larger, more realistic levels of error. Many biologists investigating (nonhuman) animal movement have shown little concern for the accuracy of the numerical differentiation algorithms employed (but see Rayner and Aldridge, 1985; Harper and Blake, 1989). By contrast, numerical differentiation algorithms have been intensively developed and evaluated in the human movement sciences (Hatze, 1981; Wood, 1982; Woltring, 1985, 1986a; D’Amico and Ferrigno, 1992; Corradini et al. 1993; Gazzani, 1994; Giakas and Baltzopoulos, 1997a). Unfortunately, the evaluation of numerical differentiators in the human movement sciences has been limited to data sampled at relatively low frequencies (50–100 Hz). Because accuracy in numerical differentiation varies with sampling frequency (Harper and Blake, 1989), animal biologists using high-speed video should exercise caution when extrapolating results from human movement studies.
In this study, a computer simulation experiment is used to compare the performance of ten numerical differentiation algorithms and to explore the effects of video magnification and video speed on algorithmic performance. Algorithmic performances were evaluated by comparing estimated velocities and accelerations with reference values from a function known a priori. I focused on two aspects of algorithmic performance: (1) the accuracy of an algorithm’s reconstruction of velocities and accelerations throughout the duration of a movement (which may be arbitrarily defined) and (2) the accuracy of estimated maximum velocities and accelerations.
MATERIALS AND METHODS
Generation of simulated data
To create reference velocity and acceleration histories with known values, I digitized the acceleration profiles illustrated in Figs 3–7 of Harper and Blake (1990). These data were from two escape responses of a rainbow trout Oncorhynchus mykiss and three escape responses of a Northern pike Esox lucius and were originally measured with an accelerometer surgically implanted into the body of the individual (Harper and Blake, 1990). The figures were digitized with between 93 and 132 unequally spaced points. For each raw trace, I used a linear regression between each pair of adjacent points as an interpolator and divided the measured function into 8001 equally spaced points. At a sampling rate of 60 Hz, 5–6 points would be sampled from the reference data, which is below the minimum number of points for some of the algorithms. I therefore added enough points with zero acceleration to the beginning of the reference sequences to allow an additional two points when sampled at 60 Hz. Because the same reference sequences were used for all simulations, this addition increased the number of points sampled at 125, 250 and 500 Hz by 4, 8 and 16, respectively. The total numbers of points, Nref, in the reference profiles for Figs 3–7 of Harper and Blake (1990) were 10 199, 10 024, 10 527, 10 199 and 10 077, respectively. The reference figures are illustrated in Fig. 1. I generated the five reference displacement profiles by doubly integrating the acceleration data using:
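The integration equation itself is not reproduced in this excerpt. As a rough sketch only (not necessarily the formula used in the original analysis), the displacement profile can be recovered by two cumulative trapezoidal integrations of the sampled acceleration, assuming zero initial velocity and displacement:

```python
def cumtrapz(y, dt):
    # Cumulative trapezoidal integral of a series sampled every dt
    # seconds; the running integral starts at zero.
    out = [0.0]
    for i in range(1, len(y)):
        out.append(out[-1] + 0.5 * (y[i - 1] + y[i]) * dt)
    return out

def displacement_from_acceleration(a, dt):
    # Doubly integrate: acceleration -> velocity -> displacement,
    # assuming zero initial velocity and displacement.
    return cumtrapz(cumtrapz(a, dt), dt)

# Constant 2 m s-2 acceleration over 1 s gives x(t) = t**2, so the
# final displacement should be 1 m.
x = displacement_from_acceleration([2.0] * 101, 0.01)
```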
To generate a sample of estimates within each VM×VS combination, I sampled the reference sequence 100 times, each time starting the sampling from a random time and adding one pixel error of random sign to one-third of the sampled values. With each simulated sequence, I estimated instantaneous velocities and accelerations using the methods described below and converted these estimates back into the original units (m s−1 and m s−2, respectively) using the inverse of equation 4.
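The sampling-and-noise step can be sketched as follows; the function name, the pixel-scale argument and the exact manner of selecting the noisy points are illustrative assumptions, not the study's precise implementation:

```python
import random

def simulate_digitized(displacement_m, pixels_per_m, seed=0):
    # Convert a displacement series (m) to pixel coordinates, add one
    # pixel of error with random sign to a random third of the
    # samples, and convert back to metres.  The name, pixel scale and
    # selection scheme here are illustrative only.
    rng = random.Random(seed)
    pixels = [x * pixels_per_m for x in displacement_m]
    n = len(pixels)
    for i in rng.sample(range(n), k=round(n / 3)):
        pixels[i] += rng.choice((-1.0, 1.0))
    return [p / pixels_per_m for p in pixels]
```

At a magnification where one body length spans one screen width, a one-pixel error corresponds to a small, fixed fraction of a body length, which is why error grows in relative importance as magnification decreases.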
Numerical differentiation algorithms compared (Table 1)
Moving regression (MR)
Given a vector of N sampled positions, yi, measured every Δt seconds, derivatives are commonly estimated by backwards, forwards or first central differences, e.g. for the first central differences, vi = (yi+1 − yi−1)/2Δt and ai = (yi+1 − 2yi + yi−1)/Δt2.
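For illustration, the first central differences can be implemented directly; this is the standard textbook construction, not code from the study:

```python
def central_differences(y, dt):
    # First central difference estimates of velocity and acceleration:
    #   v[i] = (y[i+1] - y[i-1]) / (2*dt)
    #   a[i] = (y[i+1] - 2*y[i] + y[i-1]) / dt**2
    # Endpoints are left as None because the stencil needs a
    # neighbour on each side.
    n = len(y)
    v = [None] * n
    a = [None] * n
    for i in range(1, n - 1):
        v[i] = (y[i + 1] - y[i - 1]) / (2.0 * dt)
        a[i] = (y[i + 1] - 2.0 * y[i] + y[i - 1]) / dt ** 2
    return v, a

# Positions sampled from y = t**2: v = 2t and a = 2 at interior points.
ts = [0.1 * i for i in range(6)]
v, a = central_differences([t * t for t in ts], 0.1)
```

Because these stencils apply no smoothing, any digitizing error passes straight into the derivatives, which is why the smoothing and filtering methods below are needed.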
Automated bidirectional second-order Butterworth digital filter (BF)
I used the bidirectional second-order Butterworth filter described in Winter (1990), but employed a fully automated method for choosing the optimal cut-off frequency, fc. Winter (1990) developed a semi-automatic method to find fc from the regression of root mean square error (RMSE), measured across a range of cut-off frequencies, against these cut-off frequencies. In this residual analysis (RA), the RMSE is the error from the unfiltered function, not the true function, which is unknown. A problem with implementing this as a generalized ‘black-box’ procedure is that it is unclear which frequencies to include in the linear regression.
Instead of using Winter’s (1990) approach, I employed an autocorrelation analysis of the residuals and used two different criteria for finding the optimal cut-off frequency. For each potential cut-off frequency between 0 and 0.5fs (where fs is the sampling frequency), taken at 0.1 Hz intervals, I computed the autocorrelation function of the N residuals. The sum of the squared autocorrelations (SSRA) at each lag, normalized by the autocorrelation at lag zero, was used as a measure of the residual autocorrelation. The first optimal fc was chosen as the value resulting in the minimum SSRA (Cappello et al. 1996).
For the second estimate of the optimal fc, I compared the observed SSRA at each potential fc with the upper ninetieth percentile of SSRAs computed from N random normal variates. Any set of residuals with an observed SSRA below this ninetieth percentile is considered not significantly different from a random time series. The optimal fc was chosen as the lowest value that resulted in a ‘random’ SSRA. I refer to the minimization method as BF1 and the comparison with a random time series method as BF2.
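The SSRA criterion described above can be sketched as follows; the Butterworth filtering step itself is omitted, and this simple implementation is an illustrative assumption rather than the study's code:

```python
def ssra(residuals):
    # Sum of squared autocorrelations over all non-zero lags, each
    # autocorrelation normalized by the lag-zero value.  A small SSRA
    # indicates residuals resembling white noise.
    n = len(residuals)
    mean = sum(residuals) / n
    r = [x - mean for x in residuals]
    c0 = sum(x * x for x in r)  # lag-zero autocovariance
    total = 0.0
    for lag in range(1, n):
        c = sum(r[i] * r[i + lag] for i in range(n - lag))
        total += (c / c0) ** 2
    return total

# BF1 would choose, from the candidate cut-off frequencies between 0
# and 0.5*fs, the one whose filtering residuals minimize this quantity;
# BF2 would choose the lowest cut-off whose SSRA falls below the 90th
# percentile of SSRAs from random normal series of the same length.
```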
For both variants of the optimal Butterworth filter, I first extrapolated the data using a quadratic polynomial. At each tail of the series, I fitted a quadratic polynomial through seven (if N⩾10) or five (if N<10) points and added N/2 points at equal intervals using the quadratic coefficients. This point extrapolation has been shown to improve the accuracy of a filtered function and its derivatives (D’Amico and Ferrigno, 1990; Smith, 1989). Smith (1989) found that a linear extrapolation method was the best of several alternatives (although a quadratic polynomial method was not among them). In my exploration of several extrapolation algorithms, I found that quadratic polynomial extrapolation performed better than linear extrapolation.
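A minimal sketch of the quadratic tail extension follows, shown for the trailing edge only (the leading edge would be handled symmetrically); the function names and the least-squares solver are illustrative:

```python
def quad_fit(xs, ys):
    # Least-squares quadratic y = a + b*x + c*x**2, solved from the
    # 3x3 normal equations with Cramer's rule.
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    A = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]]
    rhs = [sy, sxy, sx2y]
    d = det3(A)
    coefs = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = rhs[r]
        coefs.append(det3(M) / d)
    return coefs  # a, b, c

def extend_tail(y, n_extra, k=7):
    # Extrapolate n_extra points past the end of y using a quadratic
    # fitted through the last k points.
    k = min(k, len(y))
    xs = list(range(len(y) - k, len(y)))
    a, b, c = quad_fit(xs, y[-k:])
    return y + [a + b * x + c * x * x
                for x in range(len(y), len(y) + n_extra)]
```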
Power spectrum analysis (PSA)
Kosarev–Pantos filter (KPF)
Gazzani (1994) described the Kosarev–Pantos approximation to the Wiener filter, which estimates the optimal filter coefficients from the distribution of the Fourier-transformed data. I implemented the KPF routine exactly as listed in the appendix of Gazzani (1994). In addition, I odd-extended the data in order to create a pseudoperiodic time series (Gazzani, 1994).
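One common form of odd extension, reflecting the series about its final point, can be sketched as follows; Gazzani's (1994) exact construction may differ in detail:

```python
def odd_extend(y):
    # Reflect the series about its final point: the appended tail is
    # 2*y[-1] - y[i], mirrored in time, giving a pseudoperiodic
    # sequence of length 2*len(y) - 1 with no step discontinuity at
    # the join.
    return y + [2.0 * y[-1] - v for v in reversed(y[:-1])]
```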
Generalized cross-validatory (GCV) and predicted mean squared error (MSE) quintic spline
All algorithms were written in either Pascal or Fortran77 (LS Fortran Plug-in for Metrowerks Code Warrior, Fortner Research Inc.) and compiled using CodeWarrior Academic Gold 11 (Metrowerks Inc.) for the Power Macintosh. All of these methods are available in a single software program, QuickSAND (Quick Smoothing and Numerical Differentiation) for the Power Macintosh and clones running MacOS (Walker, 1997).
RESULTS
Global performance
The distributions of VRMSE, Vmax, ARMSE and Amax for each algorithm, pooled across sample, VM and VS, are summarized using box plots (Fig. 2). In these plots, the median is a measure of a method’s tendency, while the error variation (the variance of the errors around zero and not around the distribution’s mean or median) is a measure of a method’s robustness or ability to avoid large errors. Differences in the pooled distributions among algorithms were relatively small for the velocity but large for the acceleration measures. For VRMSE, the median estimates differed trivially but the distributions of BF2, PSA1 and KPF were positively skewed, indicating less robustness. In contrast, GCV and, especially, MSE were the most robust algorithms for estimates of the entire velocity history over the range of VM and VS simulated here. For Vmax, the MR, BF1, PSA4, GCV and MSE algorithms had generally symmetrical distributions of small variance with slightly positive medians (<5%). The negatively skewed distributions of BF2, PSA1–3 and KPF indicate a tendency to underestimate Vmax. Again, the two spline algorithms were the most robust. As for the velocity estimates, the pooled distributions suggest that MSE and, to a lesser degree, GCV had a better global performance for acceleration estimates than did the other algorithms. For ARMSE, the MSE distribution not only had the median with the smallest per cent error but also had the least positive skew, indicating its robustness. In contrast, the distribution of GCV had a relatively small median error but was highly positively skewed. Other methods that were prone to large errors were BF1, BF2, PSA1 and KPF. The distribution of Amax indicates that all methods tend to underestimate maximum accelerations. The three methods with the smallest median per cent error, BF1, PSA1 and GCV, were associated with highly positively skewed distributions. 
In contrast, the PSA2, PSA3, PSA4 and, especially, MSE algorithms had moderate to relatively large median errors but were far less likely to result in grossly incorrect estimates of maximum acceleration.
Local performance: effects of video magnification and speed
The global error distributions (Fig. 2) fail to reveal the complex interactions among algorithm, VM and VS (Figs 3–6). The major features of the distributions of the performance measures within each VM×VS combination (Figs 3–6) are (1) the marked decrease in the magnitude of both the median error and the error variance from 60 to 250 Hz and the slight (high VM) to large (low VM) increase in error from 250 to 500 Hz (especially conspicuous for the acceleration data), (2) the decrease in error with increasing VM, especially at high VS, and (3) the decrease in variance among methods from 60 to 250 Hz but the increase from 250 to 500 Hz, a pattern that was particularly conspicuous at low VM. There were exceptions to (1): throughout the increase in VS, both the maximum and RMSE performances of MSE and KPF improved, the RMSE performances of PSA1–4 improved and the acceleration performances of GCV worsened.
For VRMSE (Fig. 3), the two spline methods were clearly superior at 60 Hz. At 125 Hz, the MR, BF2 and, especially, PSA1 and KPF algorithms performed the worst at the highest VM, but differences among methods became increasingly trivial as VM decreased. In contrast, for both 250 and 500 Hz, differences among algorithms were trivial at the highest VM but were increasingly large as VM decreased. At 500 Hz and 0.25× magnification, the MR, BF1 and GCV algorithms performed the worst. The general patterns occurring in the distributions of VRMSE also occurred in the distributions of Vmax (Fig. 4). At 60 and 125 Hz, the two spline algorithms had both the smallest median errors and the smallest error variances, especially at high VM. The KPF method performed poorly at low VS but the best at high VS, particularly at high VM.
For both ARMSE (Fig. 5) and Amax (Fig. 6), the two spline algorithms were clearly superior at 60 Hz, especially at high VM. While MSE continued to have the best performance with increasing VS for ARMSE, the KPF algorithm had the best performance for Amax at high VS. PSA1–4 had relatively good performance at low VM for both ARMSE and Amax. The low sharpening factor of PSA1 resulted in a highly unstable performance for both acceleration measures at 60 and 125 Hz but good performance, relative to the other PSA variants, for estimates of Amax (but not ARMSE) at 250 and 500 Hz. The MR method performed relatively well for estimates of both maximum acceleration and the entire acceleration profile at high VM and VS, and at high VM and moderate VS, but poorly at either low VS or at high VS and low VM. For ARMSE, BF1 performed better than BF2 at 60 and 125 Hz but worse at 250 and 500 Hz. For Amax, BF1 performed better than BF2 at 60, 125 and 250 Hz. The better of BF1 and BF2 performed well, relative to the other algorithms, at 250 Hz but very poorly at lower and higher video speeds.
DISCUSSION
Computer simulation experiments have proved to be a powerful means of evaluating alternative numerical methods over a specified parameter space (Rohlf et al. 1990; Oden and Sokal, 1992; Martins and Garland, 1991). The comparison of numerical differentiation algorithms has a relatively long history but has generally been limited to qualitative comparisons of only a few model sequences (Zernicke et al. 1976; Pezzack et al. 1977; McLaughlin et al. 1977; Wood and Jennings, 1979; Cappozzo and Gazzani, 1983). Two recent simulation studies that have compared the performance of several numerical differentiation algorithms (Corradini et al. 1993; Giakas and Baltzopoulos, 1997a) are of limited use to anyone using high-speed video because of the low (50–100 Hz) sampling frequency. Additionally, while Giakas and Baltzopoulos (1997a) compared the performance for a large number of signals, they reported only the relative rankings among the algorithms and not the actual or per cent errors. The simulation in the present study, which compared the performance of ten algorithms across different combinations of video magnification and speed, was designed to lend some guidance to the choice of numerical differentiation algorithm for biologists investigating animal movement.
Before discussing the results, I want to emphasize several caveats of this analysis. (1) The simulated behavior was highly aperiodic and non-stationary, which may bias the results in favor of methods that do not assume a periodic or stationary signal (Woltring, 1985). Nevertheless, the relatively good performance of the PSA and KPF algorithms at high VS suggests that detrending and, if necessary, odd-extension are adequate for these data. Additionally, a comprehensive simulation of periodic signals should be undertaken to evaluate the performance of the spline methods with periodic data. (2) The optimal VS for an algorithm should not be considered absolute. Instead, the important comparative parameter should be some measure of the frequency of the signal’s inflection points, for the derivative order of interest, relative to sampling frequency. (3) With the exception of the MSE algorithm, I have intentionally treated the process of numerical differentiation as a fully automatic methodology with no input of either subjective or objective external knowledge (the MSE algorithm differs from the others by employing an estimate of the measurement error). The wisdom of this approach is debatable, but these results do show that good estimates can occur in complete ignorance of either the processes generating the signal or the mechanics of the algorithms. (4) Woltring (1991) suggested that the optimal smoothing parameter may differ for derivatives of different order and that more smoothing may be necessary for higher-order derivatives (see also Hatze, 1981; Giakas and Baltzopoulos, 1997b). The smoothness of the derivatives investigated in the present study were determined by the measured function only (zeroth-order derivative). (5) Finally, this study explicitly simulated data derived from video which, in general, has far lower resolution than high-speed film. 
Film-derived data should therefore have smaller expected errors than video-derived data at comparable filming magnification and speed.
Comparative performance
The simulations reported here suggest that no method performs best across all regions of the VM×VS parameter space. This conclusion is not surprising and supports the findings of analyses that simulated data sampled at lower frequencies than those reported here (Corradini et al. 1993; Giakas and Baltzopoulos, 1997a). Because of the variation in rank-order performance with changes in VM and VS, the global performance statistics should be treated cautiously. Despite the caution about using a universal differentiator, the MSE quintic spline algorithm was highly stable over the entire parameter space simulated here and can be generally recommended. When outperformed by another algorithm, both the difference between the estimates and the error from the true value were very small. A potential, but unexplored, problem with the MSE algorithm is its sensitivity to misestimates of the input error.
Corradini et al. (1993) compared the second-derivative performance of five different numerical differentiators, including a heptic spline version of MSE (the four other algorithms in their study were not investigated here). Five simulated functions sampled between 50 and 100 Hz were used for the analysis. Supporting the results reported here, Corradini et al. (1993) found that the MSE algorithm performed better than the other four algorithms for estimates of the entire acceleration profile.
The GCV performed very well over much of the parameter space. For Amax, in particular, at least 50% of the GCV estimates had relatively small errors at nearly all combinations of VM and VS. At high VS, however, the GCV algorithm proved unstable, often resulting in severely overestimated velocity and acceleration estimates. This suggests that, in general, if the GCV acceleration profile does not look markedly noisy, the resulting estimates should be reasonable. It is somewhat surprising that GCV performed as well as it did at 60 and 125 Hz. Woltring (1985, 1986a) explicitly warned against using the GCV criterion on data sets with fewer than 40 points; in the present study, the ranges of sample sizes of the data at 60 and 125 Hz were 7–8 and 15–16 points, respectively.
Giakas and Baltzopoulos (1997a) recently compared GCV with five other algorithms including PSA3 and Winter’s (1990) residual analysis (RA) method of estimating the optimal Butterworth filter (similar in spirit to BF1 and BF2). Giakas and Baltzopoulos (1997a) evaluated the performance of the algorithms on data from 24 simulated signals sampled at 50 Hz with 30 different levels of added noise. Each signal was sampled only once for a level of added noise. Giakas and Baltzopoulos (1997a) reported global rankings only (i.e. pooled over noise level) for estimates of the whole profile and found PSA3 to have better performance than both GCV and RA for both velocity and acceleration estimates. Actual error levels were not reported, except for a single signal (which, curiously, showed GCV performing better than PSA3 for nearly all levels of added noise).
The MR algorithm is a commonly used method for estimating accelerations in the animal locomotion literature, especially for measures of fast-start performance (Webb, 1977, 1978; Domenici and Blake, 1991, 1993; Kasapi et al. 1993; Law and Blake, 1996). While the MR algorithm performed well for estimating velocities for data sampled at or above 125 Hz, it only performed well for estimates of acceleration at 250 Hz. At lower sampling frequencies, the MR algorithm severely underestimated accelerations; at higher frequencies, it severely overestimated accelerations.
To stabilize edge effects at the beginning and end of a sequence, D’Amico and Ferrigno (1990, 1992) used forward and backward linear prediction to extend the observed sequence prior to filtering. While my implementation of the linear prediction extension worked well for the well-known displacement data of Pezzack et al. (1977), unusually large errors at the ends occurred for many of the simulated sequences analyzed in the present study. My use of quadratic polynomial regression to extend the data resulted in edges with less error than occurred with the linear prediction extension. In their exploration of the PSA differentiator, D’Amico and Ferrigno (1990, 1992) varied the magnitude of the sharpening factor (see Materials and methods) between 1 and 5 and concluded that a value of 1 was best if acceleration maxima are desired but a value of 2–3 tended to minimize the RMSE. D’Amico and Ferrigno’s (1992) conclusion is supported for the simulated fast-start data analyzed here; PSA1, with a sharpening factor of 1, performed much better than the other PSA algorithms for Amax. Indeed, PSA1 was one of the best methods for Amax at low VM and high VS. PSA2–4 all performed similarly, except that PSA3–4 had smaller errors at the edges than did PSA2 (this result is not apparent in the RMSE figures since only the intermediate 80% of the points were used for the calculation of RMSE).
Lanczos’ moving regression (MR) method
The use of numerical differentiation to estimate maximum acceleration was criticized by Harper and Blake (1989), who showed that estimates using typical film speeds (250 Hz) and magnification have an expected error of approximately 40%. Such high errors certainly preclude reasonable reconstruction of locomotor dynamics or comparisons of acceleration performance. Harper and Blake (1989) recognized two sources of errors: sampling frequency error (SFE) and measurement error (ME). While the presence of these errors is inherent in numerical differentiation, their influence on derivative estimates can be minimized by the selection of an appropriate numerical differentiator. Harper and Blake (1989), following Webb (1977, 1978), used Lanczos’ moving regression (MR) algorithm to estimate accelerations (Lanczos, 1956). The pattern of the error distributions for the MR algorithm observed in this study (Figs 3–6), that is, an initial decrease and final increase in per cent error, with increasing VS, closely resembles the expected pattern discussed by Harper and Blake (1989).
A major advantage of the MR algorithm is that it is computationally trivial, while its chief disadvantage is that it does not allow the user to control the degree of smoothing [see Lanshammar (1982a) for a more flexible use of piecewise polynomials]. The second derivative of a quadratic polynomial is constant and, if the function’s second derivative varies, the computed second derivative will be a weighted average of the function’s second derivatives at each of the points used to estimate it. As a consequence, estimates of the maximum second derivative of a function sampled without noise will always be too low. Harper and Blake (1989) refer to this source of error as sampling frequency error because, as the function is sampled at smaller intervals, variation in the function’s second derivative at any five neighboring points becomes increasingly smaller, and the polynomial’s second derivative will approach that of the sampled function.
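For reference, the interior-point formulas of the quadratic moving regression can be written out directly. This sketch assumes the common 5-point window; other window lengths are possible:

```python
def lanczos_derivatives(y, dt):
    # Velocity and acceleration from a moving 5-point quadratic
    # regression (after Lanczos, 1956).  At interior points:
    #   v[i] = (-2*y[i-2] - y[i-1] + y[i+1] + 2*y[i+2]) / (10*dt)
    #   a[i] = (2*y[i-2] - y[i-1] - 2*y[i] - y[i+1] + 2*y[i+2]) / (7*dt**2)
    # The two points at each edge are left as None.
    n = len(y)
    v = [None] * n
    a = [None] * n
    for i in range(2, n - 2):
        v[i] = (-2.0 * y[i - 2] - y[i - 1]
                + y[i + 1] + 2.0 * y[i + 2]) / (10.0 * dt)
        a[i] = (2.0 * y[i - 2] - y[i - 1] - 2.0 * y[i]
                - y[i + 1] + 2.0 * y[i + 2]) / (7.0 * dt ** 2)
    return v, a
```

Note that a[i] is constant over the window's fitted parabola, which is exactly the averaging behaviour discussed above: a sharply varying second derivative is flattened toward the window mean.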
Algorithms that allow different levels of smoothing are not necessarily prone to excessive underestimation of maximum velocities and accelerations due to low sampling frequency. For each of the ten algorithms investigated in the present study, I give the per cent errors (Table 2) for estimates of maximum acceleration of a function resembling that of Harper and Blake (1989) and sampled without error at 60 Hz (Fig. 7). With the exception of the KPF algorithm, the MR algorithm was most affected by sampling frequency error. In contrast, GCV and MSE each had an error of 0.06% and were, essentially, unaffected by sampling frequency error, as defined by Harper and Blake (1989). Nevertheless, as illustrated in Fig. 6, GCV and MSE were affected by sampling real data with noise at a low sampling frequency. The underestimation of acceleration in the GCV and MSE (and other) algorithms results from sampling at some time prior to and following peak acceleration, but not at the peak itself, a problem that increases with decreasing sampling frequency. While only some methods are affected by oversmoothing due to low sampling frequency (Harper and Blake’s sampling frequency error), all methods are affected by missing an event due to low sampling frequency (note that, in the heuristic example of Fig. 7, the peak was intentionally sampled to avoid this source of sampling frequency error).
Harper and Blake (1989) argued that the measurement error component of the total error in estimating maximum accelerations increases with increasing film speed because digitizing error, although constant in magnitude, increases relative to the distance moved by the animal between successive frames. As a result, the total error increases at speeds above the optimum (Harper and Blake, 1989). This argument is contrary to the analysis of derivative errors by Lanshammar (1982b), who suggested that errors will continually decrease with increasing VS. Harper and Blake’s (1989) argument may explain the extreme overestimation of Amax for the MR algorithm at 500 Hz when VM was ⩽0.5×. In addition to MR, the BF1, BF2 and GCV algorithms also markedly overestimated maximum accelerations at high VS. Importantly, however, some algorithms were resistant to this source of error, at least at the video speeds and magnifications analyzed in this study. PSA2, PSA4 and MSE nearly always underestimated maximum accelerations, and the magnitude of this underestimation decreased with VS. PSA1, PSA3 and KPF often overestimated maximum accelerations at high but not at low VS but, again, the absolute error decreased with increasing VS, at least up to 500 Hz. The results of the present study, then, show that changes in error with respect to VS are more complicated than suggested by either Harper and Blake (1989) or Lanshammar (1982b).
How to get better results
The performance of the automated differentiating algorithms over the parameter space simulated in this study suggests a more optimistic role for numerical differentiation in studies of comparative performance and locomotor dynamics than concluded by Harper and Blake (1989). Two properties are important when choosing a numerical differentiator: the expected performance of an algorithm and the robustness of the algorithm. The expected performance, represented in this study by an algorithm’s median performance, measures the bias of an algorithm. For example, the MSE algorithm consistently underestimates the actual maximum acceleration. If enough were known about this bias, a correction factor could be added to estimates of Amax. One caveat to using such a correction factor is that not only does the magnitude of the bias vary with VM and VS but it should also vary with the shape of the displacement profile (the effects of which were not measured in this study).
Perhaps more important for comparative performance analyses is an algorithm’s robustness, represented by the error variance of the performance estimates. For example, the median per cent error for Amax by the GCV algorithm at 500 Hz and 1× magnification was only −0.3%, but more than 10% of the estimates overestimated the true value by over 100%. In contrast, the MSE algorithm had a larger median per cent error (−9.5%) but a much smaller error variance. As a result, although the MSE may underestimate the actual maximum acceleration, it should underestimate the maximum by a similar magnitude for all individuals.
The results of the video data presented here suggest that even the best numerical differentiation algorithms may result in error variation large enough to preclude comparisons of subtle performance variation. For example, the large errors associated with the MR and GCV algorithms at high VS may explain the failure to find significant differences in maximum acceleration between closely related fish using film data and the MR algorithm (Law and Blake, 1996) or between samples subjected to different treatments using video data and the GCV algorithm (Beddow et al. 1995). Because film data allows much higher resolution than video, error variances should be minimized when using the KPF, PSA, GCV or MSE algorithms in combination with high-speed film.
While the RMSE associated with estimating velocity profiles was low, it was uncomfortably high for the acceleration profiles (the best performers had errors of 11.1% at 500 Hz and 2× and 27.0% at 500 Hz and 0.25×). Numerical differentiators are notoriously unstable at the edges of a sequence. Some of the methods tested here attempted to control for this by artificially extending the sequence either after (KPF) or both before and after (BF, PSA) the original sequence. Nevertheless, edge effects were apparent (which is why RMSE was measured over only the middle 80% of the points). Clearly, there is no substitute for actual measurements of displacement before and after the sequence of interest. Additionally, when using the GCV spline, the added points should increase the accuracy of the estimate of the optimal smoothing parameter.
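The edge problem and the sequence-extension remedy can be illustrated with simple central differences; the reflection padding below is a generic sketch, not the extension schemes actually used by the KPF, BF or PSA algorithms:

```python
def central_diff(y, dt):
    """Central-difference derivative; one-sided differences at the ends."""
    n = len(y)
    d = [0.0] * n
    for i in range(1, n - 1):
        d[i] = (y[i + 1] - y[i - 1]) / (2 * dt)
    d[0] = (y[1] - y[0]) / dt          # forward difference at the left edge
    d[-1] = (y[-1] - y[-2]) / dt       # backward difference at the right edge
    return d

def reflect_pad(y, k):
    """Point-reflect k samples about each end to extend the sequence."""
    left = [2 * y[0] - y[i] for i in range(k, 0, -1)]
    right = [2 * y[-1] - y[-1 - i] for i in range(1, k + 1)]
    return left + y + right

# Differentiate a padded copy, then trim back to the original length,
# so that every original point receives a central (not one-sided) estimate.
y = [t * t for t in range(10)]          # displacement ~ t^2 (toy data)
dt = 1.0
k = 3
v = central_diff(reflect_pad(y, k), dt)[k:-k]
```

Padding lets the interior formula reach the original endpoints, but the padded samples are guesses; measured displacements before and after the sequence of interest remain the only reliable fix.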
Given the large amount of data in this study, I did not explore the causes of the high RMSE; that is, whether the general pattern of peaks and troughs in the true acceleration profile was faithfully reconstructed despite the large RMSE or whether the entire shape of the profile was incorrectly estimated. This is an important distinction because completely misshapen estimates of acceleration profiles could lead to highly inaccurate dynamic models of animal movement. A useful addition to future simulations would be to explore correlates of algorithmic performance. For example, should we have more confidence in an estimate if all algorithms give similar values? Similarly, numerically estimated acceleration profiles of the center of mass during steady swimming in a fish showed little variation within swimming speed, either among sequences or among individuals (Walker and Westneat, 1997). Does the similarity among these sequences indicate good estimates of the acceleration profile, or were all of the profiles incorrectly estimated in the same way? Additional simulations could investigate whether similarity in profile shapes justifies greater confidence in numerically differentiated results.
Finally, time-averaged performance measures have been suggested as alternatives to instantaneous estimates in order to avoid the potentially large errors resulting from numerical differentiation. Indeed, many studies of fast-start performance report only the total distance traveled during some behavior or a velocity based on this distance and the duration of travel (Webb and Skadsen, 1980; Taylor and McPhail, 1986; Eaton et al. 1988; Jayne and Bennett, 1990; Norton, 1991; Swain, 1992; Watkins, 1996). While time-averaged measures of performance do not suffer from magnified error propagation, they can mask proximate explanations of performance variation. Many of the critical features of the acceleration profiles measured by Harper and Blake (1990, 1991), including time to peak acceleration and number of acceleration peaks, could not have been discovered using time-averaged performance measures. More importantly for studies of comparative performance, the variation in acceleration profiles among individual fast starts (Harper and Blake, 1990, 1991) demonstrates the complex relationship between the distance traveled during some arbitrary time and maximum acceleration. This complexity suggests that time-averaged measures of fast-start performance may be a poor indicator of acceleration performance. Inter-individual and intra-individual escape behaviors are highly variable and context-dependent (e.g. Huntingford et al. 1994). It seems likely that both high initial accelerations (e.g. rapid jumps) and high average velocities (e.g. escapes into a refuge) can be closely related to survival, but the relative influence of each may depend on the context of the response. Because of both the complexity of fast-start movements and the variation in escape behavior, time-averaged measures cannot replace instantaneous measures of escape performance.
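The masking effect can be seen directly in a toy displacement record (the displacement values below are invented for illustration, not data from this study):

```python
# Displacement samples (m) at 4 ms intervals (250 Hz), invented values
# shaped like a fast start: slow onset, burst, then coasting.
s = [0.000, 0.001, 0.004, 0.012, 0.028, 0.050, 0.073, 0.094, 0.113, 0.130]
dt = 0.004

# Time-averaged velocity: total distance over total duration.
v_avg = (s[-1] - s[0]) / (dt * (len(s) - 1))

# Instantaneous velocities by central differences (interior points only).
v_inst = [(s[i + 1] - s[i - 1]) / (2 * dt) for i in range(1, len(s) - 1)]
v_max = max(v_inst)
```

Here v_max substantially exceeds v_avg, and two responses with identical time-averaged velocities could differ greatly in their instantaneous maxima; this is precisely the information that time-averaged measures discard.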
ACKNOWLEDGEMENTS
I thank Will Corning, Nora Espinoza, Melina Hale, Mark Westneat, Brad Wright and two anonymous reviewers for greatly improving the manuscript. This research was performed with the generous support of a National Science Foundation Postdoctoral Research Fellowship in the Biosciences Related to the Environment.