Most of the major invertebrate phyla appear in the fossil record during a relatively short time interval, not exceeding 20 million years (Myr), 540-520 Myr ago. This rapid diversification is known as the ‘Cambrian explosion’. In the present paper, we ask whether molecular phylogenetic reconstruction provides confirmation for such an evolutionary burst. The expectation is that the molecular phylogenetic trees should take the form of a large unresolved multifurcation of the various animal lineages. Complete 18S rRNA sequences of 69 extant representatives of 15 animal phyla were obtained from data banks. After eliminating a major source of artefact leading to lack of resolution in phylogenetic trees (mutational saturation of sequences), we indeed observe that the major lines of triploblast coelomates (arthropods, molluscs, echinoderms, chordates…) are very poorly resolved, i.e. the nodes defining the various clades are not supported by high bootstrap values. Using a previously developed procedure, which consists of calculating the bootstrap proportion of each node of the tree as a function of an increasing number of nucleotides (Lecointre, G., Philippe, H., Le, H. L. V. and Le Guyader, H. (1994) Mol. Phyl. Evol., in press), we obtain a more informative indication of the robustness of each node. In addition, this procedure allows us to estimate the number of additional nucleotides that would be required to resolve confidently the currently uncertain nodes; this number turns out to be extremely high and experimentally unfeasible. We then take this approach one step further: using parameters derived from the above analysis, assuming a molecular clock and using palaeontological dates for calibration, we establish a relationship between the number of sites contained in a given data set and the time interval that this data set can confidently resolve (with 95% bootstrap support).
Under these assumptions, the presently available 18S rRNA database cannot confidently resolve cladogenetic events separated by less than about 40 Myr. Thus, at the present time, the potential resolution by the palaeontological approach is higher than that by the molecular one.

The notion that most lines of metazoans appear rather abruptly in the fossil record, during the Cambrian, unpreceded by identifiable forerunners, dates back to the 19th century. In fact, this observation was a point of serious concern to Darwin, who devoted several pages in ‘The origin of species’ (1859) to discussing the possible reasons for the absence of identifiable Precambrian fossils. The notion of a ‘Cambrian explosion’ of animal life is now amply documented and is particularly striking in the richness of the Burgess Shale fauna, which dates from the mid-Cambrian (approx. 520 Myr ago), but also in the slightly earlier (530-535 Myr ago), similar strata of Sirius Passet (Greenland) and Chengjiang (China). Briefly, the present view of the palaeontological evidence, as summarised by Conway Morris (1993 and this volume; see also Bowring et al., 1993), recognises three early episodes of animal life: (i) the Precambrian Ediacara fauna (570-555 Myr ago), probably mostly of diploblastic ‘grade’ of organisation, and which may be unrelated to (ii) the later Cambrian fauna, largely dominated by triploblasts, which appears at the base of the Cambrian (approx. 540 Myr ago) and then radiates explosively during (iii) the third episode, yielding representatives of most of the 35 major metazoan phyla within an interval of probably less than 20 million years. Thus, the major types of body plans of metazoans may have originated during a relatively short time interval.

These observations are of considerable interest in understanding the mechanisms of large scale evolution and the role that developmental innovations may have had in shaping animal diversity. It is therefore important that they be substantiated by independent lines of evidence. The purpose of the present paper is to inquire as to whether phylogenetic analysis of gene sequences from extant metazoans might provide confirmation for the occurrence of such an evolutionary burst of triploblasts. The central argument runs as follows: if the split between the various animal phyla took place within a short time interval as compared to the length of time elapsed since its occurrence (say 10 Myr as compared to 500 Myr), one expects that determination of the order of emergence of the various lineages using sequence data will be almost impossible, i.e. that the various animal phyla will emerge as an unresolved ‘bush’ in the molecular phylogeny. The basic reason for this expectation is that the molecular events allowing one to establish the order of emergence of the various clades on a tree are the mutations that occur on the ‘internal branches’ of the tree, in between the points of emergence of the clades under analysis: these are the synapomorphies (shared derived characters) uniting the successive clades into a series of nested groups. The longer the time interval between two cladogenetic events, the higher the probability that mutations will have accumulated within the corresponding branch in the tree and therefore the clearer the kinship of the taxa located after the branch will be.

The idea, then, is to reverse the argument and to assume that if we cannot satisfactorily resolve a multifurcation in a molecular phylogeny, it is because the time interval separating the emergence of the various clades involved has been too short with respect to the time elapsed since the event; thus, unresolved nodes in a molecular phylogeny would be interpreted as corresponding to an evolutionary radiation. This is indeed what has repeatedly been observed in the molecular phylogeny of Metazoa. However, it should be realised at the outset that the use of this argument rests on two essential parameters: a true historical one, the duration of time separating two cladogenesis events, and a methodological one, namely our ability, through our tree construction methods, to discriminate ‘well-resolved’ nodes from ‘unresolved’ ones. From the start, then, we see that this approach to the question of the Cambrian radiation is intimately linked to the problem of evaluating the reliability of nodes in molecular phylogenies. The short history of the molecular phylogeny of Metazoa provides a good illustration of how these evaluations started in a rather intuitive and non-quantitative way to become increasingly rigorous.

In the pioneering study of Field et al. (1988), for example, which was based on partial 18S ribosomal RNA sequences analysed by a distance method, the authors stressed that the order of emergence of the four major groups of coelomates they analysed, Chordata, Echinodermata, Arthropoda and a set of ‘eucoelomate protostomes’, could not be confidently resolved and suggested that this reflected a rapid phyletic splitting, i.e. a rapid radiation of all coelomate phyla. Their arguments were that the internal branches separating the points of emergence of the various taxa were short and, more importantly, that the topology was unstable, i.e. that it changed depending on the actual species sampled. These data were reanalysed by Patterson (1989) using a variety of methods, in particular parsimony, and by Lake (1990) using his evolutionary parsimony method (see Erwin, 1991 for a review of these papers). Contradictions and uncertainties in the results suggested both that a rapid radiation of eucoelomates may indeed have occurred and that the data were ‘noisy’. 28S rRNA partial sequences of a broad sample of ‘invertebrate’ species covering ten triploblastic coelomate phyla, three pseudocoelomate ones and one acoelomate were also obtained and analysed by one of us (Chenuil, 1993) with very similar results.

Later, two of us (Adoutte and Philippe, 1993) reanalysed a broader 18S rRNA partial sequence data set, using parsimony methods and bootstrap testing, and clearly confirmed three points: (1) diploblasts were deeply split from triploblasts (as had been inferred by Raff et al. (1989) and by Christen et al. (1991)); (2) platyhelminths (acoelomate triploblasts) were the sister group of coelomates and (3) the major coelomate phyla were very poorly resolved. A ‘giant’ multifurcation, comprising annelids, arthropods, molluscs, echinoderms, chordates and many more minor groups provided a fair representation of the results. Even the separation into the two major lineages of coelomates, protostomes and deuterostomes, although apparent in the tree, was not supported by significant bootstrap values. We suggested at that time that the latter point could be a reflection of the Cambrian explosion.

In the past ten years, much use has been made of the bootstrap value (or bootstrap proportion, BP) to estimate the reliability of nodes in phylogenetic trees (Felsenstein, 1985). The BP (the number of times a given node is obtained over the total number of nucleotide resamplings carried out) is indeed a convenient value to estimate the strength of the phylogenetic signal, within the framework of a given tree reconstruction method: values above 95% indicate that the data contain a strong signal in favour of this node while low values indicate that the node is poorly supported and thus in fact may not exist (see Zharkikh and Li, 1992a,b; Hillis and Bull, 1993 and Felsenstein and Kishino, 1993 for recent detailed analyses of the significance and statistical properties of the bootstrap). Thus, an expectation of a radiation process is that bootstrap proportions should be low in all the internal branches surrounding the radiation point in the tree.
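The definition of the BP given above can be illustrated with a minimal sketch. It resamples alignment columns with replacement and counts how often a chosen grouping is recovered. The function names, the toy sequences and the `closest_pair` rule (used here as a deliberately crude stand-in for a real tree reconstruction method such as Neighbour Joining) are assumptions made for illustration and are not part of the procedure used in this paper.

```python
import random

def p_distance(a, b):
    """Fraction of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def closest_pair(alignment):
    """Toy stand-in for tree building: return the two most similar taxa."""
    names = sorted(alignment)
    pairs = [(p_distance(alignment[a], alignment[b]), a, b)
             for i, a in enumerate(names) for b in names[i + 1:]]
    _, a, b = min(pairs)  # ties are broken alphabetically
    return frozenset((a, b))

def bootstrap_proportion(alignment, group, n_resamplings=1000, seed=0):
    """Percentage of column resamplings in which `group` is recovered."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(n_resamplings):
        # Draw as many columns as the alignment has, with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {name: "".join(seq[c] for c in cols)
                     for name, seq in alignment.items()}
        if closest_pair(resampled) == frozenset(group):
            hits += 1
    return 100.0 * hits / n_resamplings

# Hypothetical four-taxon alignment, for illustration only.
toy = {"A": "ACGTACGTACGT", "B": "ACGTACGTACGA",
       "C": "TTGTACCTAGGA", "D": "TTGAACCTAGGT"}
print(bootstrap_proportion(toy, ("A", "B"), n_resamplings=200))
```

In a real analysis, each resampled alignment would be passed to the tree reconstruction method, and the node of interest would be scored as present or absent in the resulting tree.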

A corollary of this criterion is the instability of the node: when poorly supported by the bootstrap, nodes often display instability in the face of variations in the length of sequence analysed and, more significantly, in the face of modifications of species sampling (as systematically analysed in Lecointre et al., 1993). That point is strikingly illustrated in the paper by Adoutte and Philippe (1993) where the addition of a single new species to the tree transforms the Metazoa from monophyletic (diploblasts + triploblasts) to biphyletic (diploblasts on one branch and triploblasts on another), in agreement with the low BP (60%) of the corresponding node. This provided an indication of the difficulty of solving this question and was interpreted as confirming the depth of the split between diploblasts and triploblasts.

In the present work, we have carried out a new analysis of an ‘edited’ database of 18S rRNA, eliminating fast evolving species and analysing the significance of the nodes in the trees by a procedure recently developed within our group. This involves calculating not only the bootstrap proportion at each node on a tree, but also determining how this value changes when different lengths of sequence are included in the analysis (Lecointre et al., 1994). This allows us to obtain a curve of the BP as a function of the number of nucleotides used, which is more informative than the mere BP value based simply on a single sequence length. In particular, this procedure allows one to estimate the number of additional nucleotides that would be required to transform a ‘moderately supported’ node into a strongly supported one. This is then combined with palaeontological data to establish a rough relationship between the length of time separating two cladogenesis events, the length of the corresponding branch and the amount of sequence information required to support it in a statistically significant way. When applied to the rRNA dataset of Metazoa, this provides an estimate of the sequencing effort that would be required to resolve closely spaced branching points in the phylogeny, an effort that turns out to be enormous and unrealistic in most cases.

Sequences used

Only species for which the full 18S rRNA sequence is available have been used in the present study. As of December 1993, this corresponded to 69 species of Metazoa in the EMBL and GenBank data banks. All the sequences were handled and further analysed through the MUST package (Philippe, 1993), and are available upon request.

Alignment and tree construction

The sequences were aligned manually using the editing functions of the MUST package (Philippe, 1993). Only confidently aligned domains were used, using stringent criteria to eliminate all doubtful portions. The boundaries of the domains thus selected are as follows, using mouse 18S rRNA nucleotide numbering as a reference: 82-125, 137-180, 187-194, 208-243, 289-307, 311-539, 548-689, 798-834, 841-1112, 1122-1407, 1440-1551, 1558-1737.

For the 69 species, this yielded a total of 1615 aligned sites of which 1010 were variable and 690 informative under the parsimony criterion. When fast-evolving species are eliminated (see Results), 55 species remain, yielding 1474 aligned sites of which 708 are variable and 486 informative.

Trees were constructed using the Neighbour Joining method (Saitou and Nei, 1987) and were submitted to bootstrapping (Felsenstein, 1985) using the NJBOOT program of the MUST package set at 1000 resamplings. All the bootstrap calculations were carried out on a Sun-Sparc 10 computer.

Calculation and display of the ‘pattern of resolved nodes’ (PRN)

The method described by Lecointre et al. (1994) was used throughout, under the following conditions. The full aligned sequences of the 55 species were each submitted to random sampling of a given number of sites (a ‘jack-knifing of sites’) through the use of a new program, PRN, running on UNIX platforms. Thirteen different sequence lengths were chosen (25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500 and 600 sites) and for each, 200 samples were drawn. Thus, a total of 2600 subsets of sequence alignments were obtained, each including all 55 species. Each of these subsets was used to construct an NJ tree, which was submitted to 1000 bootstrap replicates. All the combinations of species appearing in more than 1% of the replicates were ‘stored’ in a file (of about 60 Mb). This yields several tens of thousands of nodes. Selection of the nodes was then carried out using the new program AFT-PRN according to the following criteria: the node should correspond to a BP with an ascending tendency and it should be present in more than 2000 of the subsets of sequences. This is a rather stringent criterion allowing us to keep only nodes that appear frequently. At a given node, one could therefore display graphically the evolution of BP as a function of the number of nucleotides that were used to generate the tree (COMP BOO program of the MUST package or DISPLAY-PRN under UNIX).
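The site-sampling step described above can be sketched as follows. This is an illustration only, not the PRN program itself: the function names are hypothetical, and the sketch assumes that each subset is drawn without replacement (as the term ‘jack-knifing’ suggests) and that the same columns are taken from every species.

```python
import random

# The thirteen sequence lengths and the number of samples per length from the text.
SAMPLE_SIZES = [25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600]
SAMPLES_PER_SIZE = 200

def draw_site_subset(alignment, n_sites, rng):
    """Pick n_sites distinct columns and keep them for every species."""
    total = len(next(iter(alignment.values())))
    cols = sorted(rng.sample(range(total), n_sites))
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}

def iter_subsets(alignment, sizes=SAMPLE_SIZES,
                 per_size=SAMPLES_PER_SIZE, seed=0):
    """Yield (n_sites, sub_alignment) for every size and every sample."""
    rng = random.Random(seed)
    for n in sizes:
        for _ in range(per_size):
            yield n, draw_site_subset(alignment, n, rng)

# 13 lengths x 200 samples gives the 2600 subsets mentioned in the text.
print(len(SAMPLE_SIZES) * SAMPLES_PER_SIZE)
```

Each yielded sub-alignment would then be handed to tree construction and bootstrapping, and the BP of every node recorded against the number of sites used.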

Lecointre et al. (1994) have shown that the mean of BP can be related to the number of nucleotides, x, through the function $BP = 100\,(1 - e^{-b(x - x')})$. The parameters b and x′ are estimated by non-linear regression using the GENSTAT package.

Relationship of ‘b’ to branch length and to time

Branch lengths were directly provided by the NJ program and displayed using the TREEPLOT program. These lengths were plotted as a function of the value of the b parameter for the corresponding branch.

The following palaeontological dates were used to establish the relationship between the time elapsed between two cladogeneses and the b value: 300 Myr between the point of origin of Pectinidae (as represented by Placopecten) and that of Mactridae (as represented by Spisula; Rice et al., 1993), 50 Myr between the point of origin of gnathostomes and that of all vertebrates and 300 Myr between the point of origin of amniotes and that of eutherians (see Benton, 1990).

Relative rate test

The mean values of the distances between each of the triploblast species and the 8 diploblast ones were computed using either all types of nucleotide differences or only the gaps (Table 1).
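A minimal sketch of this computation is given below. It assumes that distances are expressed as the percentage of aligned sites that differ, and that a species is flagged as fast evolving when its mean distance to the outgroup exceeds a cutoff; the function names, the toy sequences and the cutoff value are all hypothetical.

```python
def percent_difference(a, b, gaps_only=False):
    """Percentage of aligned sites that differ between two sequences.

    With gaps_only=True, only sites where exactly one sequence has a gap count.
    """
    if gaps_only:
        diffs = sum((x == "-") != (y == "-") for x, y in zip(a, b))
    else:
        diffs = sum(x != y for x, y in zip(a, b))
    return 100.0 * diffs / len(a)

def mean_distance_to_outgroup(alignment, ingroup, outgroup, gaps_only=False):
    """Mean distance from each ingroup species to all outgroup species."""
    return {
        sp: sum(percent_difference(alignment[sp], alignment[o], gaps_only)
                for o in outgroup) / len(outgroup)
        for sp in ingroup
    }

def flag_fast_evolving(mean_distances, cutoff):
    """Species whose mean distance to the outgroup exceeds the cutoff."""
    return sorted(sp for sp, d in mean_distances.items() if d > cutoff)

# Hypothetical toy data: two outgroup and two ingroup sequences.
toy = {"out1": "AAAAAAAAAA", "out2": "AAAAAAAAAA",
       "slow": "AAAAAAAAAC", "fast": "CCCCCAAAAA"}
means = mean_distance_to_outgroup(toy, ["slow", "fast"], ["out1", "out2"])
print(flag_fast_evolving(means, cutoff=19.0))
```

Comparing every ingroup species against the same set of outgroup species is what makes the distances comparable: under equal rates they should all be similar, so markedly larger values point to accelerated lineages.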

Table 1.

Results of the relative rate test for a selected sample of species

Saturation curve

The number of inferred substitutions was calculated using the program PAUP 3 (Swofford, 1991), courtesy of H. Recipon. The corresponding matrix between all pairs of species was obtained through the TREEPLOT program, and the COMP-MAT program allowed the visualisation of the results.

Global 18S rRNA Neighbour-Joining tree of Metazoa

Fig. 1 shows the Neighbour-Joining (NJ; Saitou and Nei, 1987) tree of all the metazoan species for which complete 18S rRNA sequences were available in data banks as of December 1993. Partial sequences were excluded in order to maximise the amount of information for each species. This database contains representatives of 13 metazoan phyla including the most numerically important ones, with only one unfortunate omission, that of annelids, for which complete sequences are not available. The tree is arbitrarily rooted between diploblasts (Porifera, Ctenophora, Placozoa and Cnidaria) and triploblasts to avoid the use of a non-metazoan outgroup (which would decrease the length of alignable sequences) and because diploblasts constitute a clear outgroup to triploblasts on the basis of multiple previous evidence. The tree displayed is that directly obtained by the NJ method, a distance method that does not assume equality of evolutionary rates among branches and whose efficiency at recovering the actual tree has been shown to be reasonably good (Tateno et al., 1994). It is used here because of its great rapidity in terms of computer time, a critical advantage for the extensive calculations carried out in this paper.

Fig. 1.

Phylogeny of Metazoa based on complete 18S rRNA sequences treated by the Neighbour-Joining method. Numbers below internal branches indicate the bootstrap proportion of the corresponding node (1000 resamplings). Note the length of the branches of all the species of Nematoda and of two Insecta (Drosophila melanogaster and Aedes albopictus).

The tree displays a number of interesting features and also immediately illustrates one major source of artefact: several species or groups of species display much longer branches, indicating either a two- or three-fold higher rate of evolution in those sequences or inaccurately determined sequences. Such inequalities are known to generate topological errors in the positioning of the corresponding taxa (Felsenstein, 1978; Hendy and Penny, 1989; Swofford and Olsen, 1990). A clear example is provided by two insects, Drosophila and Aedes, which emerge as a sister group to nematodes (the latter also all having long branches) and separate from the five other arthropods, which have shorter branches and are more traditionally positioned in the tree. Both for Drosophila and for Caenorhabditis it is known that the problem lies not in the quality of the sequences but in their rapid rate of evolution, as has been pointed out in several previous papers.

In spite of these sources of errors, Fig. 1 displays a number of features that deserve a brief comment.

  • (1) The distance between diploblasts and triploblasts is indeed the largest measured in the tree between high level taxa, supporting the validity of the rooting.

  • (2) Except for the nematodes and an acanthocephalan (Moniliformis), all the other major triploblast lineages emerge very close to each other, i.e. separated by very short internal branches and correspondingly low BPs.

  • (3) Nematodes and the acanthocephalan, traditionally grouped within the pseudocoelomates, seem to emerge between the diploblasts and the platyhelminths, i.e. acoelomate triploblasts. This is contrary to the usual view which places pseudocoelomate emergence between acoelomates and eucoelomates. However, because of the great inequalities in rates of evolution and the very poor BPs in this portion of the tree, this result should not be over-emphasised.

  • (4) Contrary to our previous study, based on a parsimony analysis of partial 18S rRNA sequences (Adoutte and Philippe, 1993), the monophyly of coelomates is not strongly supported (33% BP) and the positioning of the platyhelminths as a sister group to coelomates is correspondingly weakened.

In summary, the view emerging from this initial tree is one of a large burst of all triploblastic metazoans, including platyhelminths. But this view should be considered with caution because of the inequalities in evolutionary rates observed within the tree.

In a second step, the relative-rate test (Sarich and Wilson, 1973) was systematically carried out on all the species of the tree (Table 1). The distribution of distances to the outgroup for the 55 remaining species is roughly gaussian and ranges from 14.6 to 18.3 whereas the distances for the fourteen discarded species vary from 19.7 to 25.7, noticeably far from gaussian. All fourteen taxa with too high a rate were discarded, yielding the more restricted set appearing in Fig. 2, which was used in all further calculations.

Fig. 2.

Phylogeny of Metazoa derived in the same fashion as that of Fig. 1 except for the removal of fast evolving lineages; the nodes designated by the letters A, B, C, D and E are those whose PRNs are displayed in Fig. 4.

When the rapidly evolving species are eliminated from the database, according to this criterion, several discrepancies of Fig. 1 disappear, and the BPs rise (Fig. 2). However, some interesting taxa are discarded, most notably nematodes. Among the salient points now emerging are:

  • (i) A confirmation of the outgroup status of Platyhelminthes with respect to coelomates (but still with a weak support for the monophyly of coelomates, i.e. 49% BP).

  • (ii) Higher BPs for the monophyly of coelomate protostomes and deuterostomes (75% and 69% respectively).

  • (iii) High BPs for several taxa known from independent analyses to be monophyletic, such as arthropods, molluscs, echinoderms and vertebrates. This result should, again, not be over-emphasised because the sampling is limited and biased in some of these taxa (for example there is a large over-representation of bivalves within the molluscs). All these conclusions are very similar to those of Winnepenninckx et al. (1992) who used a similar data set and similar methods.

  • (iv) In contrast, a few inconsistencies within these monophyletic groups are observed, such as the position of Limicolaria, a gastropod, amongst the bivalves within the molluscs. Several incongruities are also observed within the vertebrates such as the point of emergence of mammals and the position of chondrichthyans. This may well be due to the fact that more rapidly evolving portions of the rRNA were discarded from this analysis (to enable analysis of very distant taxa), with a resultant loss of information appropriate for more closely related ones such as vertebrates. The topology observed within deuterostomes, with echinoderms, hemichordates and urochordates as a sister group to cephalochordates and vertebrates, is also contrary to traditional zoological views. However, the very same topology was very recently reached by Wada and Satoh (1994) using a variety of tree construction methods.

Thus the view changes slightly from one that suggested complete lack of resolution of the order of emergence of the various clades to one starting to display some significant pattern but still with low resolution. Can this lack of resolution be the result of yet another type of bias in the data?

We have shown previously that mutational saturation of sequences can strikingly decrease the resolution of several nodes in a tree by bringing the corresponding branching points artefactually much closer to each other (Philippe et al., 1994; Philippe and Adoutte, 1994). We therefore verified that the relatively low resolving power of 18S rRNA was not due to saturation problems. This is easily achieved by plotting the number of inferred substitutions between all pairs of species, as deduced from a parsimony algorithm, against the actual number of differences recorded between the extant sequences. When saturation is present, the molecular distances measured between extant sequences level off while the distances deduced from the tree continue to rise. This procedure was applied to the rRNA data set used in the present study and it can be seen in Fig. 3 that a slight saturation is discernible but that it is weak, and we think it is unlikely to perturb the interpretations severely. The data are therefore suggestive of a true radiation.

Fig. 3.

Comparison of patristic distances (inferred numbers of substitutions) versus number of observed nucleotide differences between the sequences used to construct the tree of Fig. 2 (see text).

A closer look at critical nodes in the tree: the PRN method

Instead of simply examining the values of the BP at important nodes as a criterion of robustness of the corresponding nodes, we have recently introduced a procedure of BP analysis which involves following the value of BP as a function of an increasing number of nucleotides taken into account (Lecointre et al., 1994; see Materials and Methods). Various types of curves are thus obtained. Furthermore it has been shown that the shape of the curve in the case of resolved nodes (PRN) fits closely a simple monomolecular function of the form:

$$BP = 100\,\left(1 - e^{-b(x - x')}\right)$$

where x is the number of informative sites, and where b and x′ are parameters specific to each node, which can be estimated directly from the data through non-linear regression. In fact, it was shown that the b value from this equation provides an accurate descriptor of the curve because x′ is always close to 0. b is much more informative than a single BP value since it is correlated at the same time to BP and to the general shape of the curve.
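As a rough illustration of how b can be estimated from the mean BP values, the sketch below fixes x′ at 0 (following the remark that x′ is always close to 0) and minimises the sum of squared residuals over b with a golden-section search. This is a simplification and not the GENSTAT non-linear regression used in this work, which estimates both parameters; the function names, the search bounds and the synthetic data are assumptions.

```python
import math

def predicted_bp(x, b, x_prime=0.0):
    """BP = 100 (1 - exp(-b (x - x')))."""
    return 100.0 * (1.0 - math.exp(-b * (x - x_prime)))

def sse(b, xs, mean_bps, x_prime=0.0):
    """Sum of squared differences between observed and predicted mean BP."""
    return sum((y - predicted_bp(x, b, x_prime)) ** 2
               for x, y in zip(xs, mean_bps))

def fit_b(xs, mean_bps, x_prime=0.0, lo=1e-6, hi=0.1, tol=1e-10):
    """Golden-section search for the b that minimises the squared error.

    Assumes the error has a single minimum between lo and hi.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if sse(c, xs, mean_bps, x_prime) < sse(d, xs, mean_bps, x_prime):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0

# Synthetic mean BPs generated from a known b, to check that the fit recovers it.
sizes = [25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600]
synthetic = [predicted_bp(x, 0.004) for x in sizes]
print(fit_b(sizes, synthetic))
```

A larger b corresponds to a curve that reaches its plateau with fewer sites, which is why b summarises both the final BP and the shape of the curve.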

The most interesting part of this procedure lies at the high extremes of BP: for these values of BPs, it discriminates between cases in which this high value is reached very quickly (i.e. using a small number of nucleotides) and those in which it is reached more slowly (i.e. after using a much larger number of nucleotides); thus, even within the category of fully resolved nodes, a discrimination can be made between those that are extremely strong and those that are moderately strong.

Let us take some examples from nodes displayed in Fig. 2. Node A and node B correspond respectively to the monophyly of vertebrates and arthropods; both are highly supported by the BP: 100% and 98% respectively. Their PRN, displayed in Fig. 4A,B, are clearly of the fully resolved type, with BPs reaching 98-100% with the 18S rRNA data already available. The shape of the curve is substantially different however in these two cases: the slope is extremely steep in the case of vertebrates, the plateau being reached after using only a very small portion of the available nucleotides, while in the case of arthropods the plateau is reached much more gradually and the full complement of nucleotides is required to reach the highest BP. In fact, the monophyly of arthropods is known, from extensive previous work, to be difficult to establish using rRNA sequence data (Turbeville et al., 1991), while that of vertebrates is a robust feature of phylogenies based on very diverse characters, both molecular and anatomical. One possible interpretation of the difference between these two clades is that diversification into the various classes occurred earlier in the history of the phylum Arthropoda than in the vertebrates.

Fig. 4.

The patterns of resolved nodes (PRN) for 5 nodes taken from the phylogeny of Fig. 2. In each graph, the 200 bootstrap proportions (ordinate) obtained for a given number of nucleotides are plotted vertically as a function of increased number of nucleotides taken into account (abscissa) with increments of 50 nucleotides at each step until the full number of variable sites (about 700) of the 18S rRNA data set is included. The graphs correspond respectively to the monophyly of vertebrates (A), that of arthropods (B), that of coelomate protostomes (C), and that of deuterostomes (D). Graphs E and F correspond to two possible contradictory topologies at the base of triploblasts: one which places Platyhelminthes as the outgroup to all coelomates (as shown on Fig. 2, node E) and that which incorporates Platyhelminthes within triploblasts (node F, not shown). The open circles on each set of vertical points correspond to the average value of all the bootstrap proportions. The general shape of the curve joining these open circles has been previously found to be described by a function of the type $BP = 100\,(1 - e^{-b(x - x')})$. Parameters b and x′ can be obtained from the experimental data through non-linear regression and introduced back into the function to draw the curves as shown.

Nodes C and D are especially interesting in the context of the present paper since they correspond to the monophyly of coelomate protostomes and deuterostomes respectively. Their BPs are 75% and 69% and they yield PRNs (Fig. 4C,D) that are clearly different from the two just discussed in that, although the curve rises steadily, they do not reach 100% BP with the available amount of data. In such a case it is interesting to compute, on the basis of these two curves, the number of nucleotides that would be needed to yield a value of 95% BP. This is respectively 1300 and 2100 variable nucleotides (and thus about a threefold higher total number of nucleotides to be sequenced) in each of the 55 species displayed in Fig. 2. Since the complete 18S rRNA was used already, one could turn to 28S rRNA, assuming that its rate of evolution is roughly similar. However, even complete 28S rRNA would not be sufficient in the present case. Thus, although both the BP and overall shape of the PRNs for these two nodes are intuitively in favour of the monophyly of the two corresponding groups, it is seen that establishing the point definitively using a stringent criterion would require considerable experimental effort. Such a situation is even more drastically illustrated in the case of the node which suggests the monophyly of a group composed of the echinoderms, urochordates and hemichordates on Fig. 2. This node is supported by a BP of only 46% and, as indicated above, is contrary to the traditional zoological assumption which groups this set of taxa paraphyletically with the other deuterostomes (cephalochordates and vertebrates) into a single large monophyletic unit. In this case, reaching a value of BP of 95% would require 3500 variable nucleotides.
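The number of nucleotides needed to reach a target BP follows from inverting the fitted function: solving BP = 100 (1 − e^(−b(x − x′))) for x gives x = x′ − ln(1 − BP/100)/b. The short sketch below applies this inversion; the function names are hypothetical, and the b value in the example is back-calculated from the 1300-nucleotide figure quoted above purely as a consistency check, not taken from the regression results.

```python
import math

def sites_needed(b, target_bp=95.0, x_prime=0.0):
    """Number of sites x at which the fitted curve reaches target_bp (in %)."""
    return x_prime - math.log(1.0 - target_bp / 100.0) / b

def b_for_requirement(n_sites, target_bp=95.0, x_prime=0.0):
    """The b value for which target_bp is reached at exactly n_sites."""
    return -math.log(1.0 - target_bp / 100.0) / (n_sites - x_prime)

# With x' = 0, reaching 95% BP at 1300 sites corresponds to b = ln(20) / 1300.
b_example = b_for_requirement(1300)
print(b_example, sites_needed(b_example))
```

Because the requirement scales as 1/b, a node whose curve rises half as steeply needs twice as many sites to reach the same level of support.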

Node E, supporting the monophyly of triploblastic coelomates, again has a medium BP (49%) and a slowly ascending PRN (Fig. 4E). Confirmation of the monophyly of coelomates would require 2600 variable nucleotides. This is in contrast to our previous work, based on shorter sequences of 18S rRNA and a different taxonomic sampling, which more clearly supported the monophyly of coelomates (75% BP) with platyhelminths as a sister group (Adoutte and Philippe, 1993). This is probably a reflection of the strong impact of the particular species sampling under analysis, as analysed in detail in Lecointre et al. (1993). One further result illustrating the difficulty in solving this node can be found in the fact that an alternative topology, that which groups platyhelminths with coelomate protostomes, displays a PRN very similar to that obtained in the case of the monophyly of coelomates (Fig. 4F); in fact, the hypothetical node displayed in Fig. 4F requires the same number of nucleotides to be solved as that of Fig. 4E (2600), indicating nearly identical support for two opposing topologies. An ascending shape therefore does not necessarily indicate that the BP will continue to grow with increasing sequence length; the particular sampling of sites and species may produce this pattern by chance alone. One must keep in mind that our new approach only allows the number of nucleotides to be sequenced to be estimated. It is necessary to obtain more data to establish the ‘good’ phylogenetic pattern (here the monophyly of coelomates).

At any rate, both the data for the monophyly of coelomates and for that of coelomate protostomes and deuterostomes conform to the idea that these various branchings are difficult to resolve and may therefore correspond to a rapid radiation.

As stated in the Introduction, our ability to resolve nodes in a phylogenetic tree critically depends on the differences that accumulate within the internal branches of the tree. Instead of simply trying to define resolved versus unresolved nodes, could it now be possible to approach the relationship between the length of time elapsed between two cladogenesis events and the length of the corresponding internal branches of the tree? That is, is it possible to estimate the smallest time interval between two events that we are able to identify with confidence using given molecular tools? If such a relationship could be established for a given molecule such as rRNA, then we would be in a position to define its resolving power in terms of millions of years.

The resolving power of rRNA

To answer the questions just raised, we need to go through three successive steps and make two assumptions. We start from the PRN function described above which relates the BP of a given node to b and to the number of nucleotides; this function simplifies to

$$BP = 100\,(1 - e^{-bx}) \qquad (1)$$

by assuming x' is negligible. Since we can calculate the value of b independently from x and if we set the BP at a required value (such as 95%), then we can calculate x as a function of b. If, furthermore, we have related b to time, then we can calculate x as a function of time, that is, for a given time interval expressed in millions of years (Myr), we can calculate the number of nucleotides that a given molecule requires to be discriminative.

We will therefore first establish that branch lengths and the b value defined above for the PRN formula are correlated, as could be expected; then we will show that absolute time is correlated to b, using dates derived from the fossil record which enable us to calibrate the relationship. At that stage we make the two assumptions, namely that internal branch lengths, as determined through the NJ method, reflect ‘true’ branch lengths, that is they are proportional to the real number of mutational events that have been fixed between two cladogeneses, and second, that branch lengths are directly proportional to real time, that is we assume a molecular clock. Having now a relationship between b and time, we can reincorporate this information into the initial formula relating b to the number of nucleotides and deduce the shape of the curve relating the number of variable sites to time.

The relationship of b to branch length is shown in Fig. 5. It can be seen that the experimental points fit a regression line rather well; in addition, it was observed that the quality of the correlation did not depend on the depth of the node in the tree. Thus, the first requirement is satisfied.

Fig. 5.

Correlation between the length of internal branches as determined by the Neighbour-Joining method and the value of parameter b (see text).

The relationship of b to absolute time is shown in Fig. 6. Only three time points, listed in Material and Methods, were available, so the regression rests on three points only. This nonetheless allows us to calculate a rough estimate of the value of k in the formula

$$b = k\,\Delta T_c$$

where k is a proportionality constant and ΔTc is the time interval separating two cladogeneses. k was thus calculated to be equal to 0.000108.
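As an illustration of how such a proportionality constant can be estimated, here is a minimal Python sketch of a least-squares fit forced through the origin, assuming the proportional form b = k·ΔTc. The three calibration points below are hypothetical placeholders, not the time points used in this study, and fitting through the origin is an assumption of the sketch rather than a documented detail of the regression.

```python
def fit_k_through_origin(points):
    """Least-squares slope k of b = k * dTc for (dTc in Myr, b) pairs.

    The line is forced through the origin, so k = sum(t*b) / sum(t*t).
    """
    sum_tb = sum(t * b for t, b in points)
    sum_tt = sum(t * t for t, _ in points)
    if sum_tt == 0:
        raise ValueError("need at least one point with a non-zero time interval")
    return sum_tb / sum_tt

# HYPOTHETICAL calibration points (time interval in Myr, fitted b value).
# These are illustrative numbers only, not the data behind Fig. 6.
example_points = [(50.0, 0.0055), (100.0, 0.0105), (200.0, 0.0220)]
k_example = fit_k_through_origin(example_points)
print(f"k = {k_example:.6f}")
```

With only three points, the estimate of k is necessarily rough, and any error in k propagates directly into the time intervals derived from it.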

Fig. 6.

Relationship between the time elapsed between two cladogeneses and the b value (see text).

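We then substitute this expression of b in equation (1) and take BP to be equal to 95%. We obtain

$$95 = 100\,(1 - e^{-k\,\Delta T_c\,x})$$

from which we extract

$$x = \frac{-\ln(0.05)}{k\,\Delta T_c} \approx \frac{28000}{\Delta T_c}$$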

We thus see that the number of nucleotides required to resolve a node at the 95% BP level is simply related to time through a constant. We can therefore plot the curve of x as a function of the duration of time separating two cladogeneses having established the value of k above in the case of rRNA. As could be expected from the initial formula the curve obtained (Fig. 7) is a hyperbola.

Fig. 7.

Relationship between the time elapsed between two cladogeneses and the number of variable sites needed to resolve a given node with a 95% BP. The curve is drawn using the function x = 28000/ΔTc (see text). The three intercepts on the ordinate correspond to the number of informative sites, respectively, in partial 28S rRNA sequences (200), full 18S rRNA (708) and full 18S + 28S rRNA (1500).

From this curve, it can be calculated that with 200 variable sites (which is a common number when partial 28S rRNA sequences are used) the discrimination power of the dataset is not better than 140 Myr. With 708 variable sites, which is the case with the complete 18S rRNA dataset used in the present study, the resolving power is 40 Myr. If we assume that 18S + 28S rRNA together yield roughly 1500 variable sites, the resolving power with the complete rRNA unitary cluster improves to 19 Myr but, due to the shape of the curve, improvement in resolution is clearly not linear with respect to sequence length. For example, to reach a resolving power of 1 Myr, 28000 variable nucleotides would be required so that a total of at least 80,000 homologous nucleotides should be sequenced, which is the equivalent of 40 times the 18S rRNA.
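The figures quoted above follow directly from the relation x = 28000/ΔTc given in the legend of Fig. 7. The short Python sketch below reproduces them; the function and constant names are ours, and the sketch simply assumes the rounded constant 28000 and the value k = 0.000108 stated in the text.

```python
import math

K_RRNA = 0.000108            # proportionality constant k reported in the text
ROUNDED_CONSTANT = 28000.0   # rounded constant used for the curve of Fig. 7

def constant_from_k(k=K_RRNA, target_bp=95.0):
    """Unrounded constant -ln(1 - BP/100) / k, which the text rounds to 28000."""
    return -math.log(1.0 - target_bp / 100.0) / k

def resolving_power_myr(n_sites, constant=ROUNDED_CONSTANT):
    """Shortest interval (Myr) resolvable at 95% BP with n_sites variable sites."""
    return constant / n_sites

def sites_for_interval(dtc_myr, constant=ROUNDED_CONSTANT):
    """Variable sites needed to resolve two cladogeneses dtc_myr apart at 95% BP."""
    return constant / dtc_myr

datasets = [
    ("partial 28S rRNA", 200),
    ("complete 18S rRNA", 708),
    ("18S + 28S rRNA", 1500),
]
for label, n_sites in datasets:
    print(f"{label}: {n_sites} sites -> {resolving_power_myr(n_sites):.1f} Myr")
print(f"sites needed for 1 Myr: {sites_for_interval(1.0):.0f}")
```

Rounded to the nearest Myr, the three datasets give about 140, 40 and 19 Myr. Using the unrounded constant from `constant_from_k()` (close to 27700) shifts these values by about 1 Myr at most, which is well within the uncertainty of a calibration based on three points.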

The question raised at the beginning of this paper as to whether molecular phylogenies of extant metazoans allow the visualisation of the Cambrian explosion in the form of an unresolved multifurcation has been reformulated in the course of the analysis: through the set of tools developed in this and previous work, we have been able to evaluate better the significance of nodes displaying low bootstrap values, i.e. ‘poorly resolved nodes’, intuitively interpreted so far as corresponding to a radiation; more importantly, we have suggested a procedure allowing us to assign a time limit to such nodes (as a function of the specific data set under analysis). We are thus in a position to evaluate the resolving power of rRNA sequence data in terms of the minimum time span between two cladogenetic events required for it to identify safely the corresponding internal branch.

The most striking result obtained is that the resolving power of the presently available complete 18S rRNA database is only about 40 Myr: this is the duration of time required for the molecule to identify a distinct cladogenesis with a bootstrap proportion of 95%. Such a time interval is much longer than those that can be resolved with the presently available radiometric geochronological methods. For example, using uranium-lead zircon dating, Bowring et al. (1993) have recently calibrated early and middle Cambrian rocks with a 5-10 Myr precision. If the fossil record is of good quality, i.e. if it is reasonably continuous, without too many gaps, palaeontology performs better than 18S rRNA and better even than 18S + 28S rRNA in the sense that it can narrow the time period during which a biological group has diversified better than the molecules can. Obviously, palaeontological dating does not resolve phylogenetic patterns; it simply allows one to estimate the lapse of time during which biological groups have diversified. Because methods are now available to evaluate the quality of the fossil record (see Benton, 1994), the comparison of molecular and palaeontological data will become increasingly rigorous.

Is this conclusion confounded by methodological problems? Several parameters other than the relative shortness of the time interval can contribute to obscuring a ‘true’ node in a phylogenetic tree. We are presently studying these parameters systematically both on real and on simulated data sets. Of those parameters for which some data are already available, the most prominent are mutational saturation of sequences and inequalities in evolutionary rates of the different taxa under analysis. We suggest from Fig. 3 that saturation is not prominent in the dataset used here and we have reduced the problem of unequal rates by eliminating fast evolving lineages from the tree. We are aware that several additional pitfalls are conceivable such as biases in species sampling (Lecointre et al., 1993), inequalities in the density of the topology in certain areas of the tree and level of homoplasy as well as other factors. However, our present simulations suggest that the most important parameter affecting resolution of nodes remains the length of internal branch, i.e. the time interval separating the cladogenesis events (Philippe, unpublished data). To minimise the risks resulting from these various uncertainties, we have used, throughout the present work, a stringent criterion for the bootstrap value, 95%, although several recent studies conclude that 70% is a reasonable value in view of the conservative aspect of the bootstrap (Zharkikh and Li, 1992a,b; Hillis and Bull, 1993). Indeed, the impact of species sampling on BP is very strong (as illustrated by the shift from a BP of 60% for the monophyly of Metazoa to a BP of 60% for the polyphyly of Metazoa upon the addition of a single species, see Introduction), especially when the internal branch length is short and few species are used (Philippe and Douzery, 1994). Thus, since this phenomenon is not controlled, we believe our conservative criterion for BP of 95% is justified.
The conclusion thus holds that the molecular data do point to a radiation of triploblastic coelomate animals if we are willing to qualify a 40 Myr interval as such.

These results have two broad implications in terms of developmental biology. Firstly, superposition of the developmental characteristics of the various lineages over the phylogeny pattern can indicate the order of emergence of developmental traits. Secondly, and more significantly in the context of the present paper, confirmation of the notion of a relatively rapid Cambrian radiation raises a number of interesting questions concerning the mechanisms of diversification of the body plan of Metazoa.

With respect to the first point, we will limit ourselves to four inferences based on the phylogeny.

  • (1) The confirmation of the deep split between diploblasts and triploblasts suggests that diversification into these two major types of metazoan occurred very early in animal evolution. The ancestral state in diploblasts is probably only two tissue layers. The development of an interstitial type of cell in some lineages such as some cnidarians and some ctenophores is probably derived and independent of that which occurred in triploblasts which developed the mesoderm from the outset.

  • (2) The sister group status of the platyhelminths with respect to the coelomates, reasonably well supported in the present study, confirms that an acoelomate triploblastic stage has preceded invention of the coelom in metazoans. Furthermore, the molecular phylogeny is compatible with a single origin of the coelom at a point located between the emergence of platyhelminths and the coelomate radiation.

  • (3) The position of platyhelminths also suggests that, within the triploblasts, spiral development is more primitive than radial, if spiral development within coelomate protostomes is taken to be homologous to that of platyhelminths. If such is the case, then radial cleavage is a derived feature.

  • (4) The existence of two major lines of coelomates, the traditional coelomate protostomes and deuterostomes, is apparent on our trees and reasonably well supported by the bootstrap. Significantly, the two corresponding nodes yield PRNs suggesting that the monophyly of each group may ultimately be confirmed if more data become available. This suggests that the embryological character traditionally used to separate these two lineages, i.e. blastopore fate, is, despite various troublesome exceptions, of real phylogenetic significance with protostomy being the ancestral state. Secondary mouth formation is correlated with the origin of the deuterostomes. It should be remembered that blastopore fate is also correlated with mode of origin of mesoderm and coelom formation.

As for the second point, how to account for a rapid diversification of body plans during the Cambrian, the following comments can be made. It is now well established that the common ancestor of protostomes and deuterostomes already possessed a diversified complement of HOX-class homeobox-containing genes belonging to the HOM/Hox complex (Akam, 1989; Schubert et al., 1993). In fact, comparison of insect and vertebrate HOM/Hox complexes indicates that their common ancestor already possessed at least 6 or 7 of the major genes. In addition, it is now clear that the nematode, C. elegans, a ‘pseudocoelomate’, also possesses a HOM/Hox complex of at least four genes (Kenyon and Wang, 1991) and work from several laboratories (Oliver et al., 1992; Webster and Mansour, 1992; Bartels et al., 1993) including ours (Balavoine and Telford, unpublished data) indicates that platyhelminths have a large diversity of Antp-type HOX genes. Finally, such genes have also been identified in diploblasts (Schierwater et al., 1991; Shenk et al., 1993). In view of these data, Slack et al. (1993) have proposed that possession of a basic HOM/Hox complex expressed at the phylotypic stage was a synapomorphy of all Metazoa. It can therefore be hypothesised that the ancestor of coelomates and, in fact, the ancestor of triploblasts already possessed a diversified array of HOM/Hox genes involved in defining broad domains along the anteroposterior body axis. In addition, there is now substantial evidence to indicate that genes involved in major developmental processes such as cell to cell signalling, cell adhesion, cell migration, gene regulation, etc. are of very ancient evolutionary origin (reviewed by Shenk and Steele, 1993). Our suggestion, then, is that the major genetic tools used for carrying out embryological development, defining axes and polarities, establishing cell determination and cell differentiation, were already present within Metazoa, prior to the Cambrian radiation.
This radiation can therefore be viewed essentially as a tinkering process (Jacob, 1981), variously combining and regulating an already available basic set of genes; because no major genetic innovation was required but rather a shuffling of already available elements and because the various marine ecological niches were essentially empty, one can readily accept that an ‘explosive’ process might have taken place.

The striking conclusion reached in this paper concerns the shape of the curve relating the number of nucleotides required to resolve a node significantly to the corresponding time interval: because of the hyperbolic shape of this curve, the number of nucleotides required, and hence the sequencing effort needed, does not rise in direct proportion to the decrease in time but follows an inverse relationship. Nodes separated by a long time interval will be consistently resolved with very short sequences while those separated by short intervals will require a disproportionate amount of primary sequence data. This simple rule explains many of the results that have appeared through the years in the phylogenetic literature. For example, it is striking that from the early days of molecular phylogeny to the present time, the improvement in the resolution of some points of the phylogeny of vertebrates, as seen through the globin genes, has been modest (see for example Goodman et al., 1987). These, in fact, are usually the same nodes that resisted analysis using 18S or 28S rRNA (Stock et al., 1991; Le et al., 1993) even when relatively large species samples were used such as in the study of Le et al. (1993). In all these cases, one is probably in the steeply ascending portion of the hyperbola where a considerable sequencing effort is required for a small improvement in resolution. Probably the clearest example of such a situation is in the case of the man-chimpanzee-gorilla trichotomy where a considerable increase in data only modestly improved the quality of the resolution (Holmquist et al., 1988a,b; Felsenstein, 1988).

The resolution of difficult nodes using the same type of sequence information will therefore, in most cases, require a disproportionate experimental effort, usually out of reach. There is an alternative to this strategy, however, which displays a great similarity to that used in traditional comparative anatomy. This involves carrying out comparative anatomy of genomes. The principle is that rare genomic events such as duplications, transpositions, rearrangements, grouping of genes in the form of operons, mitochondrial genetic code, etc. constitute, in some cases, excellent phylogenetic markers. For example, the physical organisation of the genes of the HOM/Hox complexes (see Kappen and Ruddle, 1993 for review) provides several characters that can be used as strong phylogenetic indicators. These rare events, however, are difficult to exploit for two reasons: the methods for using them in a phylogenetic framework are only starting to be developed and the amount of information (number of species) is scarce. Major breakthroughs in collecting this type of data and in the treatment of the data have, nevertheless, recently appeared. For example, Sankoff et al. (1992) have used gene order in mitochondrial DNA to carry out broad scale phylogenies. The authors propose a measure of gene order rearrangement as well as an algorithm and software to compute it. Such a method could now be applied to a variety of gene complexes for which qualitative information on gene order and organisation is available (many bacterial operons for example). A similar type of approach can be expected to be applicable to the extensive gene order and physical organisation type of information now emerging from the big genome projects. Indeed, the very recent determination of a long portion of chromosome of C. elegans is already yielding much information of this type (Wilson et al., 1994).

We especially thank our colleagues Herve Le Guyader and Guillaume Lecointre for many discussions and contributions to the ideas expressed in this paper. We thank the laboratory of D. Dacunha Castelle (Statistiques appliquees, Universite Paris-Sud, Orsay) for the GenStat program, and are grateful to Max Telford for his critical reading of the text. This work was supported by grants from the CNRS, the Universite Paris-Sud and DRED (Direction de la Recherche et des Etudes Doctorales) for computing equipment. Cecile Couanon is thanked for expert assistance in the preparation of the manuscript.

Adoutte, A. and Philippe, H. (1993). The major lines of metazoan evolution: Summary of traditional evidence and lessons from ribosomal RNA sequence analysis. In Comparative Molecular Neurobiology (ed. Y. Pichon), pp. 1-30. Basel: Birkhäuser.

Akam, M. (1989). Hox and HOM: homologous gene clusters in insects and vertebrates. Cell 57, 347-349.

Bartels, J. L., Murtha, M. T. and Ruddle, F. H. (1993). Multiple Hox/HOM-class homeoboxes in platyhelminthes. Mol. Phyl. Evol. 2, 143-151.

Benton, M. J. (1990). Vertebrate Palaeontology. London: Harper Collins Academic.

Benton, M. J. (1994). Palaeontological data and identifying mass extinctions. Trends Ecol. Evol. 9, 181-185.

Bowring, S. A., Grotzinger, J. P., Isachsen, C. E., Knoll, A. H., Pelechaty, S. M. and Kolosov, P. (1993). Calibrating rates of early Cambrian evolution. Science 261, 1293-1298.

Chenuil, A. (1993). Etude des relations de parenté entre les principaux groupes d’invertébrés protostomiens par amplification, séquençage et comparaison de portions du gène de l’ARN 28S. Thesis, Université Sci. Tech. Languedoc, Montpellier, France.

Christen, R., Ratto, A., Baroin, A., Perasso, R., Grell, K. G. and Adoutte, A. (1991). An analysis of the origin of metazoans, using comparisons of partial sequences of the 28S rRNA, reveals an early emergence of triploblasts. EMBO J. 10, 499-503.

Conway Morris, S. (1993). The fossil record and the early evolution of the Metazoa. Nature 361, 219-225.

Conway Morris, S. (1994). Why molecular biology needs palaeontology. Development 120, Supplement, 1-13.

Darwin, C. (1859). The Origin of Species (6th edition, 1872). London: John Murray.

Erwin, D. H. (1991). Metazoan phylogeny and the Cambrian explosion. Trends Ecol. Evol. 6, 131-134.

Felsenstein, J. (1978). Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27, 401-410.

Felsenstein, J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39, 783-791.

Felsenstein, J. (1988). Perils of molecular introspection. Nature 335, 118.

Felsenstein, J. and Kishino, H. (1993). Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull. Syst. Biol. 42, 193-200.

Field, K. G., Olsen, G. J., Lane, D. J., Giovannoni, S. J., Ghiselin, M. T., Raff, E. C., Pace, N. R. and Raff, R. A. (1988). Molecular phylogeny of the animal kingdom. Science 239, 748-753.

Goodman, M., Miyamoto, M. M. and Czelusniak, J. (1987). Pattern and process in vertebrate phylogeny revealed by coevolution of molecules and morphologies. In Molecules and Morphology in Evolution: Conflict or Compromise? (ed. C. Patterson), pp. 141-176. Cambridge: Cambridge University Press.

Hendy, M. D. and Penny, D. (1989). A framework for the quantitative study of evolutionary trees. Syst. Zool. 38, 297-309.

Hillis, D. M. and Bull, J. J. (1993). An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42, 182-192.

Holmquist, R., Miyamoto, M. M. and Goodman, M. (1988a). Higher-primate phylogeny - Why can’t we decide? Mol. Biol. Evol. 5, 201-216.

Holmquist, R., Miyamoto, M. M. and Goodman, M. (1988b). Analysis of Higher-Primate phylogeny from transversion differences in nuclear and mitochondrial DNA by Lake’s methods of evolutionary parsimony and operator metrics. Mol. Biol. Evol. 5, 217-236.

Jacob, F. (1981). Le jeu des possibles. Fayard.

Kappen, C. and Ruddle, F. H. (1993). Evolution of a regulatory gene family: HOM/HOX genes. Curr. Opin. Genet. Dev. 3, 931-938.

Kenyon, C. and Wang, B. (1991). A cluster of Antennapedia-class homeobox genes in a nonsegmented animal. Science 253, 516-517.

Lake, J. A. (1990). Origin of the metazoa. Proc. Natl. Acad. Sci. USA 87, 763-766.

Le, H. L. V., Lecointre, G. and Perasso, R. (1993). A 28S rRNA-based phylogeny of the Gnathostomes: first steps in the analysis of conflict and congruence with morphologically based cladograms. Mol. Phyl. Evol. 2, 31-51.

Lecointre, G., Philippe, H., Le, H. L. V. and Le Guyader, H. (1993). Species sampling has a major impact on phylogenetic inference. Mol. Phyl. Evol. 2, 205-224.

Lecointre, G., Philippe, H., Le, H. L. V. and Le Guyader, H. (1994). How many nucleotides are required to resolve a phylogenetic problem? The use of a new statistical method applicable to available sequences. Mol. Phyl. Evol. (in press).

Oliver, G., Vispo, M., Mailhos, A., Martinez, C., Sosa-Pineda, B., Fielitz, W. and Ehrlich, R. (1992). Homeoboxes in flatworms. Gene 121, 337-342.

Patterson, C. (1989). Phylogenetic relations of major groups: conclusions and prospects. In Hierarchy of Life. Molecules and Morphology in Phylogenetic Analysis (ed. B. Fernholm, K. Bremer and H. Jörnvall), pp. 471-488. Amsterdam: Excerpta Medica.

Philippe, H. (1993). MUST, a computer package of Management Utilities for Sequences and Trees. Nucl. Acids Res. 21, 5264-5272.

Philippe, H. and Adoutte, A. (1994). What can phylogenetic patterns tell us about the evolutionary processes generating biodiversity? In Aspects of the Genesis and Maintenance of Biological Diversity (ed. M. Hochberg, J. Clobert and R. Barbault). Oxford: Oxford University Press (in press).

Philippe, H. and Douzery, E. (1994). Quartet approach in molecular phylogeny: a note of caution as exemplified by the Cetacea/Artiodactyla relationships. J. Mam. Evol. (in press).

Philippe, H., Sorhannus, U., Baroin, A., Perasso, R., Gasse, F. and Adoutte, A. (1994). Comparison of molecular and paleontological data in diatoms suggests a major gap in the fossil record. J. Evol. Biol. 7, 247-265.

Raff, R. A., Field, K. G., Olsen, G. J., Giovannoni, S. J., Lane, D. J., Ghiselin, M. T., Pace, N. R. and Raff, E. C. (1989). Metazoan phylogeny based on analysis of 18S ribosomal RNA. In Hierarchy of Life. Molecules and Morphology in Phylogenetic Analysis (ed. B. Fernholm, K. Bremer and H. Jörnvall), pp. 247-260. Amsterdam: Excerpta Medica.

Rice, E. L., Roddick, D. and Singh, R. K. (1993). A comparison of molluscan (Bivalvia) phylogenies based on palaeontological and molecular data. Mol. Marine Biol. Biotechnol. 2, 137-146.

Saitou, N. and Nei, M. (1987). The Neighbor-Joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406-425.

Sankoff, D., Leduc, G., Antoine, N., Paquin, B., Lang, B. F. and Cedergren, R. (1992). Gene order comparisons for phylogenetic inference: Evolution of the mitochondrial genome. Proc. Natl. Acad. Sci. USA 89, 6575-6579.

Sarich, V. M. and Wilson, A. C. (1973). Generation time and genomic evolution in primates. Science 179, 1144-1147.

Schierwater, B., Murtha, M., Dick, M., Ruddle, F. H. and Buss, L. W. (1991). Homeoboxes in cnidarians. J. Exp. Zool. 260, 413-416.

Schubert, F. R., Nieselt-Struwe, K. and Gruss, P. (1993). The Antennapedia-type homeobox genes have evolved from three precursors separated early in metazoan evolution. Proc. Natl. Acad. Sci. USA 90, 143-147.

Shenk, M. A., Bode, H. R. and Steele, R. E. (1993). Expression of Cnox-2, a HOM/HOX homeobox gene in hydra, is correlated with axial pattern formation. Development 117, 657-667.

Shenk, M. A. and Steele, R. E. (1993). A molecular snapshot of the metazoan ‘Eve’. Trends Biol. Sci. 18, 459-463.

Slack, J. M. W., Holland, P. W. H. and Graham, C. F. (1993). The zootype and the phylotypic stage. Nature 361, 490-492.

Stock, D. W., Gibbons, J. K. and Whitt, G. S. (1991). Strengths and limitations of molecular sequence comparisons for inferring the phylogeny of the major groups of fishes. J. Fish. Biol. 39 (suppl. A), 225-236.

Swofford, D. L. (1991). PAUP: Phylogenetic Analysis Using Parsimony (Illinois Natural History Survey, Champaign, IL), Version 3.0 s.

Swofford, D. L. and Olsen, G. J. (1990). Phylogeny reconstruction. In Molecular Systematics (ed. D. M. Hillis and C. Moritz), pp. 411-501.

Tateno, Y., Takezaki, N. and Nei, M. (1994). Relative efficiencies of the maximum-likelihood, neighbor-joining and maximum-parsimony methods when substitution rate varies with site. Mol. Biol. Evol. 11, 261-277.

Turbeville, J. M., Pfeifer, D. M., Field, K. G. and Raff, R. A. (1991). The phylogenetic status of arthropods, as inferred from 18S rRNA sequences. Mol. Biol. Evol. 8, 669-686.

Wada, H. and Satoh, N. (1994). Details of the evolutionary history from invertebrates to vertebrates, as deduced from the sequences of 18S rDNA. Proc. Natl. Acad. Sci. USA 91, 1801-1804.

Webster, P. J. and Mansour, T. E. (1992). Conserved classes of homeodomains in Schistosoma mansoni, an early bilateral metazoan. Mech. Dev. 38, 25-32.

Wilson, R., Ainscough, R., Anderson, K. et al., and Wohldman, P. (1994). 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans. Nature 368, 32-38.

Winnepenninckx, B., Backeljau, T., van de Peer, Y. and De Wachter, R. (1992). Structure of the small ribosomal subunit RNA of the pulmonate snail, Limicolaria kambeul, and phylogenetic analysis of the Metazoa. FEBS Lett. 309, 123-126.

Zharkikh, A. and Li, W. H. (1992a). Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences: I. Four taxa with a molecular clock. Mol. Biol. Evol. 9, 1119-1147.

Zharkikh, A. and Li, W. H. (1992b). Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences: II. Four taxa without a molecular clock. J. Mol. Evol. 35, 356-366.