CALIFORNIA INSTITUTE OF TECHNOLOGY
Bi 11 Organismic Biology

Speciation, Phylogeny, and Taxonomy

Natural selection is a mechanism that can cause evolutionary change within a population. This idea is reasonably well understood. The second idea that is covered by the term evolution is the origin of new kinds of organisms, or species. Here we are going to discuss a problem that is not well understood. The reason may be related to the conclusion that an understanding of natural selection is not enough to allow us to predict evolutionary results.
So, this discussion is going to raise questions, and introduce terminology, but not give you many answers.
We might imagine that one possible result of natural selection would be the evolution of a general-purpose organism that was so well adapted to life that it would live optimally in whatever circumstances it happened to be born into. At the other extreme, we might imagine a world containing a continuous distribution of organisms, with every possible gene combination represented by a population of organisms living in a portion of the environment for which this gene combination was ideally suited. By itself, the idea of natural selection is compatible with both of these situations, and does not tell us which might be more likely.
Observation, however, tells us that neither of these situations exists. We observe that life on this planet involves millions of distinct species, each specialized for a particular ecological niche. This distribution of species is discontinuous. In the majority of cases, species can be discriminated, and do not show a continuous distribution of intermediates. Exceptions to this are expected, because the distribution of species is not static, but is the result of evolutionary processes (speciation) and at intermediate stages the distinction between species may be incomplete.
An integral part of the definition of a particular kind, or species, of organism is a definition of the ecological niche that it is specialized for, or adapted to. Various ways have been proposed for defining ecological niches. One view is that the niche is just a description of all the ways that a species interacts with its environment. A more quantitative view is that the niche is a small hypervolume in a multidimensional space defined by all of the parameters that can measure the environmental interactions. If the fitness of the organisms of a species -- the ability to survive and reproduce -- is defined as a function of all of these parameters, then the niche would be the hypervolume in which the fitness is above some critical value.
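The quantitative view of the niche can be made concrete with a small sketch. Everything specific here is an assumption made for illustration: two environmental parameters instead of many, a Gaussian fitness surface, and the particular optimum, widths, and critical value. The niche is then the region where fitness is at or above the critical value, and its "hypervolume" (an area, in two dimensions) is estimated by counting grid cells.

```python
import math

# Hypothetical species: optimum and tolerance along two environmental parameters.
OPTIMUM = (20.0, 20.0)
WIDTH = (5.0, 3.0)

def fitness(x1, x2):
    """Assumed fitness surface: a Gaussian peak centred on the optimum."""
    z1 = (x1 - OPTIMUM[0]) / WIDTH[0]
    z2 = (x2 - OPTIMUM[1]) / WIDTH[1]
    return math.exp(-0.5 * (z1 * z1 + z2 * z2))

def niche_area(critical, lo=0.0, hi=40.0, n=400):
    """Estimate the area of the region where fitness >= critical.

    The square [lo, hi] x [lo, hi] is cut into n x n cells and each cell
    is judged by the fitness at its midpoint.
    """
    step = (hi - lo) / n
    inside = 0
    for i in range(n):
        x1 = lo + (i + 0.5) * step
        for j in range(n):
            x2 = lo + (j + 0.5) * step
            if fitness(x1, x2) >= critical:
                inside += 1
    return inside * step * step
```

Raising the critical value shrinks the niche: `niche_area(0.9)` is smaller than `niche_area(0.5)`. For this particular Gaussian surface the region is an ellipse, so the estimate can be compared with the exact area, pi * 5 * 3 * 2 * ln(1/critical).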
The general rule is that each species must have a unique ecological niche. This rule is sometimes referred to as the competitive exclusion principle. Its acceptance is in part based on experiments where organisms from two species have been placed in experimental situations where the environment is too simple to support two distinct ecological niches. In such experiments, one of the species dies out -- it is eliminated by competition with the species that is better adapted to the particular environment. Saying that there is only one niche is equivalent to saying that the interactions between the environment and the organisms of each species are identical, which is equivalent to saying that natural selection cannot distinguish between the individual organisms as members of distinct species. The mixture of two species will in effect be a single population on which natural selection will act to eliminate the least successful individual organisms. There is nothing to stabilize the coexistence of two distinct species, and eventually the population will consist of descendants of just one of the species. The time required for this will depend on the degree of difference between the adaptation of each species to the common niche.
Note that this does not mean that two niches cannot overlap. Within a region of overlap, organisms of both species may be able to coexist, as long as the situation is stabilized by the existence of regions in which each species is better adapted.
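The statement that the time required for exclusion depends on the degree of difference between the two species can be illustrated with a minimal sketch. The model is an assumption, not something taken from the experiments mentioned above: both species grow exponentially in one shared niche, and only their relative proportions are followed. The function names, starting fraction, and threshold are hypothetical.

```python
import math

def fraction_of_weaker(t, r_strong, r_weak, start=0.5):
    """Fraction of the mixed population belonging to the slower species at time t."""
    weak = start * math.exp(r_weak * t)
    strong = (1.0 - start) * math.exp(r_strong * t)
    return weak / (weak + strong)

def exclusion_time(r_strong, r_weak, start=0.5, threshold=0.01):
    """Time at which the slower species has fallen to `threshold` of the total.

    The odds weak:strong shrink as exp(-(r_strong - r_weak) * t), so the
    time is the log of the required change in odds divided by the rate gap.
    """
    odds_start = start / (1.0 - start)
    odds_end = threshold / (1.0 - threshold)
    return math.log(odds_start / odds_end) / (r_strong - r_weak)
```

With rates 0.8 and 0.6 the slower species drops to 1% after roughly 23 time units; with rates 0.8 and 0.79 the same decline takes about 20 times longer. In this sketch the exclusion time is inversely proportional to the difference in rates.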

All of this leads up to the main questions: What determines how specialized organisms become, and what determines the number of distinct ecological niches? Are there many empty niches? [For important background information, read R. M. May, "How many species are there on Earth?", Science 241:1441-1449 (1988)]
Speciation is a process where one ancestral species is succeeded in time by two distinct, specialized species with distinct ecological niches. How does this happen? What determines whether it will happen? Does it involve subdivision of an ecological niche into two, new, more restricted niches? Or, does it involve invasion of an adjacent "empty niche" without modification of the original species? Or, are other scenarios possible?

In answering these questions, it may be helpful to recognize speciation as one component of a more general process, the evolution of a community. The term community is used in a technical sense by ecologists to summarize information about what species are found in a particular ecological region and about the sizes of the populations of each species. In a particular region, the evolution of a community may involve not only evolutionary change in its component species (usually on a very long time scale), but also changes on a much shorter time scale such as changes in the sizes of the component populations, and addition of new species from outside the region. These more rapid changes are often described by the term ecological succession. Although the time scales are usually quite different, both succession and species evolution may be contributing to the evolution of a community at any point in time. We tend to think of both processes as leading to an equilibrium in which the community is saturated with species. However, this probably makes more sense for ecological succession, where the equilibrium is commonly referred to as a climax community. In areas undisturbed by man, climax communities would probably cover a large fraction of the earth's land surface. There is nothing that guarantees that a climax community will be indefinitely stable. A climax community may be disrupted by natural events such as fire, floods, landslides, diseases, herds of foragers, as well as by human interventions. In other cases, communities may become senile and spontaneously collapse. Succession can be studied after such events.
Although we do not have a complete theory to explain and predict what happens during either succession or evolution, there are a number of consistent features that have been identified:
1) During succession, it is often observed that the dominant species during early successional stages are species that seem to be best adapted to living in non-saturated communities. These species decline or disappear altogether when the community becomes saturated.
2) Large areas support more species than small areas (islands).
3) In a non-saturated community, intraspecific competition favors niche enlargement by species evolution. This is commonly seen on isolated islands.
4) In a saturated community, resource competition between species favors niche sharpening by species evolution. In some cases, predation may also become important in determining community composition, but in general it does not seem to be as important a factor as resource competition.
5) Niche sharpening is typically accompanied by reduced population size.
6) Trophic levels impose structure on communities. The use of free energy at each trophic level limits the amount available at the next trophic level, and therefore limits the number of possible trophic levels.
7) Persistent habitats support more species than ephemeral habitats.
8) A need for a minimum population size or density for the survival of a species may limit the number of species. This may be why species numbers approach a saturation level, but doesn't provide any specific quantitative predictions.
9) Obviously, a heterogeneous environment (such as a rocky tidepool area) has possibilities for more ecological niches than a more homogeneous environment (such as the open ocean).
10) Specialization is favored when necessary resources are "coarse-grained" rather than "fine-grained". [see (A) below]
11) Even when a resource or other environmental parameter varies continuously, a discontinuous distribution of species may be more stable than a continuous distribution of individuals. [see (B) below]
12) In sexually reproducing organisms, reproductive isolation is also an important determining factor in the evolution of specialized species, in addition to niche specialization. [see (C) below]
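Feature 6 in the list above lends itself to a short sketch. The numbers are purely illustrative assumptions: a fixed fraction of the free energy at one trophic level is taken to be available to the next, and a level is taken to exist only if its energy supply meets some minimum. The function and parameter names are hypothetical.

```python
def supportable_levels(primary_input, efficiency, minimum_required):
    """Count trophic levels whose energy supply is at least `minimum_required`.

    primary_input    -- free energy captured at the first trophic level
    efficiency       -- assumed fraction passed on to the next level (0 to 1)
    minimum_required -- assumed smallest supply that can sustain a level
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    levels = 0
    energy = primary_input
    while energy >= minimum_required:
        levels += 1
        energy *= efficiency  # only this fraction reaches the next level
    return levels
```

For example, `supportable_levels(10000, 0.1, 5)` gives 4 levels, and doubling the assumed efficiency to 0.2 adds only one more. Because the supply falls geometrically, the number of levels grows only logarithmically with the energy input.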

A) Consider again the monomolecular organisms derived from the RNA virus, Qβ. Suppose we have a situation containing two kinds of replicase molecules in a tube set up for replication of Qβ RNA (we don't care where they came from). Obviously, this is now a heterogeneous environment which could contain two different ecological niches. So we might imagine that one possible outcome would be the evolution of two distinct RNA species, one adapted to rapid replication on replicase A, and the other adapted to rapid replication on replicase B. But will this happen? We could also imagine that if the replicases are not too different, there might be a species of RNA that could replicate on either replicase -- a generalist. Is there any way to predict the result?
let A = mole fraction of replicase A.
let B = mole fraction of replicase B.
let ra = reproductive rate of a species on pure replicase A.
let rb = reproductive rate of a species on pure replicase B.
In a mixture of replicase A and B, the mean reproductive rate for a species is:

f = A*ra + B*rb.

Consider the case where A=B=0.5, ra=0.8 and rb=0.2. This is a specialist that is better adapted to replicase A.
Then, f = 0.5. Half of the encounters with a replicase result in relatively slow replication on replicase B.

Now, a generalist might be one that could reproduce equally well on A and B, with ra=rb=0.6. Then, f = 0.6 in the environment with A=B=0.5, and the generalist would be expected to outgrow the specialist, if the outcome is determined by the reproductive rate. However, if ra>1 and rb=0.2 for the specialist, the specialist could be the winner. Obviously, there is no general rule, and the result depends on the details of what each species can do.
However, we can extend this to produce a useful result. The situation that has been considered to this point can be termed a "fine-grained" environment in which the two resources (the two replicases) are encountered randomly each time the organism is reproducing. The alternative is a patchy, or "coarse-grained" environment. Suppose we have the replicases isolated in two tubes, one containing replicase A and the other containing replicase B (but without labels on the tubes). Since we don't know which tube is which, we inoculate both tubes, allow a time t for growth, and then mix them together before taking the inoculum for the next tubes. Now the result of evolution may be different:
Assuming exponential growth in each tube:
For the specialist: f = exp(0.8*t)+exp(0.2*t)
For the generalist: f = 2*exp(0.6*t)
This allows us to calculate that for t<2.4 the generalist will grow more than the specialist, but for t>2.4, the specialist will overtake the generalist. In a patchy environment, there is a high probability that the descendants of a species adapted to a particular patch will be able to continue growing on the same resource.
The generalization is that patchiness in the environment favors specialization. Of course, detailed quantitative information about each species is still required to predict the outcome in a particular case.
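The two calculations in (A) can be checked with a short sketch. The formulas are the ones given in the text; the function names and the bisection search are choices made here for illustration. For the rates used in the text the crossover can also be found exactly: substituting x = exp(0.2*t) turns the condition exp(0.8*t) + exp(0.2*t) = 2*exp(0.6*t) into x^3 - 2x^2 + 1 = 0, whose relevant root is x = (1 + sqrt(5))/2, giving t = 5*ln(x), which is about 2.4.

```python
import math

def fine_grained_rate(a, b, ra, rb):
    """Mean reproductive rate in a mixture: f = A*ra + B*rb."""
    return a * ra + b * rb

def coarse_grained_growth(ra, rb, t):
    """Combined growth after time t in two separate tubes: exp(ra*t) + exp(rb*t)."""
    return math.exp(ra * t) + math.exp(rb * t)

def crossover_time(specialist, generalist, lo=0.1, hi=10.0, tol=1e-9):
    """Bisection search for the time at which the specialist catches the generalist.

    Assumes the generalist is ahead at `lo` and the specialist is ahead at `hi`,
    which holds for the rates used in the text.
    """
    def gap(t):
        return (coarse_grained_growth(specialist[0], specialist[1], t)
                - coarse_grained_growth(generalist[0], generalist[1], t))

    if not (gap(lo) < 0.0 < gap(hi)):
        raise ValueError("no sign change between lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `fine_grained_rate(0.5, 0.5, 0.8, 0.2)` and `fine_grained_rate(0.5, 0.5, 0.6, 0.6)` reproduces the values 0.5 and 0.6 from the fine-grained case, and `crossover_time((0.8, 0.2), (0.6, 0.6))` agrees with the closed-form value 5*ln((1 + sqrt(5))/2).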
With higher organisms, patchiness may be exploited by behavioral mechanisms for selecting a particular resource.

B) There is an important paper by R. M. May & R. H. MacArthur (1972): "Niche overlap as a function of environmental variability" Proc. Nat. Acad. Sci. USA 69: 1109-1113. This paper examines the stability of overlapping niches along a resource gradient. If the gradient is stable in time, overlapping niches can be packed closely together along the gradient. However, if the resources fluctuate randomly, the stable result changes to one where there is a finite number of niches distributed along the gradient. The number is not very sensitive to the amount of fluctuation in the resource.

C) If we have a population of interbreeding, sexually reproducing, organisms, there will be substantial gene flow throughout the population. This will tend to counter any tendency for genetic specialization for two niches that might be potentially available to the population.
Because of this, species evolution often involves some degree of geographical isolation between portions of a population. Geographical isolation can arise if a large population becomes split by geological changes, creating two isolated populations. Such extreme and dramatic isolation is probably not very common. An idea that is especially popular with evolutionary theorists is that new species are likely to evolve in small populations on the fringes of a larger population that have only limited genetic exchange with the main part of the population. Evolutionary specialization may occur rapidly within such a small population. If it is successful, and this population starts to expand, its continued isolation from the original population then depends on the evolution of various biological mechanisms for reproductive isolation. These are also more likely to appear in small, isolated, populations. The first mechanisms for reproductive isolation to appear are usually post-zygotic mechanisms, leading to reduced viability or fertility of offspring resulting from reproduction between members of the old and new populations. These mechanisms can appear rapidly as a result of genetic incompatibilities resulting from various kinds of genetic rearrangements in the smaller, isolated population.
Post-zygotic mechanisms are wasteful, and there is then a strong selection pressure favoring the evolution of pre-zygotic mechanisms that prevent mating between members of the two populations. Such mechanisms include mate selection, behavioral and/or physical incompatibilities, differences in mating seasons or locations, etc. Once these have become fully established, the two populations can be considered to be two distinct species that can remain distinct and exploit different ecological niches, even if their geographical ranges overlap.
Species that have evolved into two distinct species remaining in different geographical areas are termed allopatric species. When they are found in the same geographical area, they are termed sympatric species. Allopatric speciation can occur as a result of geographic isolation. Sympatric speciation occurs in the absence of geographic isolation. There is considerable debate about whether this can occur to any significant extent.
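The opening point of (C), that gene flow counters genetic specialization, can be illustrated with a deliberately simple sketch. The model is an assumption introduced here, not one described in the text: two subpopulations, one gene with two alleles, haploid selection with advantage s for allele X in the first niche and the same advantage for the alternative allele in the second, followed each generation by exchange of a fraction m of each subpopulation.

```python
def divergence_at_equilibrium(m, s=0.1, generations=5000):
    """Difference in frequency of allele X between the two subpopulations.

    m -- assumed fraction of each subpopulation replaced by migrants per generation
    s -- assumed selective advantage of the locally favored allele
    """
    p1 = p2 = 0.5  # both subpopulations start out identical
    for _ in range(generations):
        # Selection: X is favored in niche 1, the other allele in niche 2.
        after1 = p1 * (1.0 + s) / (1.0 + s * p1)
        after2 = p2 / (1.0 + s * (1.0 - p2))
        # Gene flow: each subpopulation receives a fraction m from the other.
        p1 = (1.0 - m) * after1 + m * after2
        p2 = (1.0 - m) * after2 + m * after1
    return p1 - p2
```

With little exchange (m = 0.01) the two subpopulations end up strongly differentiated; with m = 0.1 the difference is much smaller; and with complete mixing (m = 0.5) it disappears entirely, even though selection still favors different alleles in the two niches.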

Even if there were no reason to be intellectually curious about speciation, an understanding of how to distinguish different species is needed simply for the task of managing all of the information that biologists accumulate about different kinds of organisms. We need identifiers, and a system of categories, for each species. This is the subject matter of taxonomy.

Taxonomy has two components:

1) identification and naming of species
2) a system of hierarchical categories that is necessary in order to manage information about millions of species -- just as a hierarchical filing system is used to manage a large number of files on a computer disc.

Example 1: Consider a simple example, with three organisms, 1, 2, and 3:
Organism:          1  2  3
Characteristic A:  +  +  +
Characteristic B:  +  +  0
Characteristic C:  +  0  0
Characteristic D:  0  +  0
Characteristic E:  0  0  +

The unique distinguishing characteristics C, D, and E justify the placement of each organism in a separate category. All belong to the group of "A bearers"; 1 and 2 belong to the subgroup of "B bearers". This can easily be represented by a tree diagram:

A ---+--- B ---+--- C --- organism 1
.....|.........+--- D --- organism 2
.....+--- E ------------- organism 3

This diagram can also be interpreted as a description of history, or phylogeny. A is a characteristic that was present in the common ancestor of all 3 organisms. B is a new feature that appeared in the common ancestor of 1 and 2, differentiating it from the ancestor of 3. C, D, and E are new features that appeared independently in each of the 3 organisms.

Example 2: Now consider a slightly different example, with three organisms, 1, 2, and 3:
Organism:          1  2  3
Characteristic A:  +  +  +
Characteristic B:  +  +  0
Characteristic C:  +  0  0
Characteristic D:  0  +  0
Characteristic E:  0  0  +
Characteristic F:  0  +  +

There is now no simple way to organize this information in a hierarchical tree. There are many situations, such as organization of information into a database, where simple hierarchical systems don't work.
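The difference between the two examples can be checked mechanically. A minimal sketch, assuming each characteristic is treated as the set of organisms that bear it: a table fits a simple hierarchy only if every pair of these sets is either nested (one inside the other) or disjoint. The dictionary layout and function name are choices made here for illustration.

```python
from itertools import combinations

# Each characteristic mapped to the set of organisms marked "+" in the tables.
EXAMPLE_1 = {"A": {1, 2, 3}, "B": {1, 2}, "C": {1}, "D": {2}, "E": {3}}
EXAMPLE_2 = dict(EXAMPLE_1, F={2, 3})

def conflicting_pairs(characters):
    """Return pairs of characteristics that cannot both define groups in one tree.

    Two characteristics conflict when their bearer sets overlap but neither
    set contains the other.
    """
    conflicts = []
    for name1, name2 in combinations(sorted(characters), 2):
        set1, set2 = characters[name1], characters[name2]
        overlap = set1 & set2
        nested = set1 <= set2 or set2 <= set1
        if overlap and not nested:
            conflicts.append((name1, name2))
    return conflicts
```

`conflicting_pairs(EXAMPLE_1)` returns an empty list, while `conflicting_pairs(EXAMPLE_2)` returns `[("B", "F")]`: B groups organisms 1 and 2, F groups organisms 2 and 3, and no single tree contains both groups.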

The fact that hierarchical systems work quite well for categorizing organisms is an empirical observation. If they did not work well, Linnaeus and others would not have proposed them and they would not be widely used. Today, the generally accepted interpretation is that organisms must fit into a hierarchical system because they evolved by straightforward phylogenetic processes involving gradual change of ancestral organisms and creation of new species from single ancestors. Therefore, the correct taxonomic hierarchy should be based on phylogeny. Phylogenetically related organisms should be grouped together, and phylogenetically unrelated organisms should be in separate groups. This makes good sense, but:
***Our knowledge of phylogeny is incomplete, but for practical purposes we need a complete taxonomy.
***Examples are encountered that do not fit a simple hierarchy or phylogeny, such as Example 2, above.
Consequently, most taxonomic schemes include a significant amount of speculation and hypotheses based on interpretation and evaluation of limited evidence about phylogeny.

Efforts to improve upon this situation have progressed in 2 contrasting directions:

1) "Numerical taxonomy" (aka "phenetics")
Obtain quantitative measures of as many characters as possible.
Analyze these to measure the degree of similarity between different species, or higher groups. In a case like Example 2, above, quantitative measures of the number of similarities like B vs. the number of similarities like F can be used to distinguish between the tree with a common ancestor for 1 and 2, vs. the tree with a common ancestor for 2 and 3.
No explicit attention to phylogeny, but there is an inherent assumption that if enough similarities are measured, the results must reflect phylogeny.
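A minimal sketch of the phenetic approach, applied to the table of Example 2. Two simple measures are used as assumptions here: the number of characteristics that two organisms both possess, and the simple matching coefficient (the fraction of characteristics in the same state, counting shared absences as well). Real phenetic studies use many more characters and a variety of similarity measures.

```python
from itertools import combinations

# Example 2 from the text: 1 = "+", 0 = absent.
TABLE = {
    1: {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0, "F": 0},
    2: {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0, "F": 1},
    3: {"A": 1, "B": 0, "C": 0, "D": 0, "E": 1, "F": 1},
}

def shared_presences(x, y):
    """Number of characteristics present in both organisms."""
    return sum(1 for c in TABLE[x] if TABLE[x][c] == 1 and TABLE[y][c] == 1)

def simple_matching(x, y):
    """Fraction of characteristics with the same state in both organisms."""
    same = sum(1 for c in TABLE[x] if TABLE[x][c] == TABLE[y][c])
    return same / len(TABLE[x])

def similarity_report():
    """Both measures for every pair of organisms."""
    return {
        pair: (shared_presences(*pair), simple_matching(*pair))
        for pair in combinations(sorted(TABLE), 2)
    }
```

With only these six characteristics, pairs (1, 2) and (2, 3) tie on both measures (2 shared presences, matching coefficient 0.5), and both score higher than pair (1, 3). The table alone cannot decide between the two trees. Measuring more characters, and counting how many behave like B and how many like F, is what would break the tie.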

2) "Cladistic taxonomy" (aka "cladistics")
Absolute dependence on reconstruction of a phylogeny. An example such as Example 2, above, is analyzed carefully to understand why it doesn't fit a simple hierarchical scheme. For example, characteristic F might be an ancestral character, like A, that was lost by organism 1 after the appearance of characteristic B. In this case, the characteristic "loss of F" is another feature of organism 1, like characteristic C, and F is not a useful feature for grouping together (2+3). Wherever possible, cladistic taxonomy is based on the appearance of new characteristics, rather than the loss of characteristics. Another possibility is that characteristic F evolved independently in organisms 2 and 3, after these two lines were distinguished by the appearance of characteristic B. In other words, the "Fness" of 2 and 3 represents analogy, rather than homology. Cladistic taxonomy emphasizes the importance of distinguishing homologies and using them to define groups.
These ideas are not new, but cladists are strong-minded about applying them rigorously and explicitly, and excluding other reasons for forming categories.
There is a lot of specialized terminology used by cladists, but the only new term that we will note here is clade -- used to refer to a taxon, or taxonomic group, that is united by one or more homologies -- common features resulting from common ancestry. The alternative is grade -- used to refer to a taxon that is created for convenience to contain organisms with similarities that may have arisen independently. A good example of a grade is Arthropoda. This term was originally used as the name of a phylum, but it is now believed that the features characteristic of arthropods may have arisen independently in three or more different lines, and it is becoming more accepted to separate the arthropods into three or more phyla, so that the phyla represent clades. Obviously, the only taxonomic categories acceptable to cladists are clades!

Cladistic taxonomy has become rather controversial in the last couple of decades. Its adherents claim that it is the only objective way of finding a phylogenetically based taxonomy. It is criticized for implying phylogenetic certainty in cases where sufficient information is not available, and for rejecting important information that should be incorporated into taxonomy.

The following "tree" summarizes generally accepted ideas about the phylogeny of the tetrapod Vertebrates:

[Tree diagram; tips, left to right: Birds, Crocodilians, Lizards & Snakes, Turtles, Mammals, Amphibians]

A resolute cladistic taxonomist would therefore use the following taxonomic scheme:

Tetrapoda
....Amphibia
....Amniota
.........Mammalia
.........Reptilomorpha
................Anapsida (turtles)
................Diapsida
.........................Lepidosaura (snakes, lizards)
.........................Archosauria
..................................Crocodilia (crocodiles and dinosaurs)
..................................Aves (birds)

A more moderate cladistic taxonomist might accept the following scheme:

Tetrapoda
.....Class Amphibia
.....Class Mammalia
.....Class Anapsida (turtles)
.....Class Lepidosaura (snakes, lizards)
.....Class Crocodilia
.....Class Aves

The more conventional, mainstream view would use the following scheme:

Tetrapoda
.....Class Amphibia
.....Class Reptilia (turtles, snakes, lizards, crocodiles)
.....Class Aves
.....Class Mammalia

This arrangement incorporates two features that are not acceptable in strict cladistic taxonomy.

1) The lumping of reptiles together in a class equal to birds incorporates the idea that the magnitude of the structural, functional, and ecological changes accompanying the evolution of the birds is much greater than the differences between the several groups of reptiles.
2) Placing mammals at the end incorporates the information from paleontology that the mammals became a major group much later than the reptiles.
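Why the conventional Class Reptilia is not acceptable in strict cladistic taxonomy can be shown with a small sketch. It encodes the resolute cladistic scheme given above as a nested structure, and treats a group as a clade only if it consists of exactly the terminal groups descended from one node of that tree. The data layout and function names are assumptions made for illustration.

```python
# The resolute cladistic scheme from the text, as (name, children) pairs.
TREE = ("Tetrapoda", [
    ("Amphibia", []),
    ("Amniota", [
        ("Mammalia", []),
        ("Reptilomorpha", [
            ("Anapsida", []),
            ("Diapsida", [
                ("Lepidosaura", []),
                ("Archosauria", [
                    ("Crocodilia", []),
                    ("Aves", []),
                ]),
            ]),
        ]),
    ]),
])

def collect_leaf_sets(node, found):
    """Record, for every node, the set of terminal groups descended from it."""
    name, children = node
    if children:
        leaves = frozenset().union(*(collect_leaf_sets(c, found) for c in children))
    else:
        leaves = frozenset([name])
    found[name] = leaves
    return leaves

def is_clade(group, tree=TREE):
    """True if `group` is exactly the set of terminal groups under some node."""
    found = {}
    collect_leaf_sets(tree, found)
    return frozenset(group) in found.values()

# Conventional Class Reptilia: turtles, snakes and lizards, crocodiles.
REPTILIA = {"Anapsida", "Lepidosaura", "Crocodilia"}
```

`is_clade(REPTILIA)` is False, because the smallest node containing all three groups (Reptilomorpha) also contains Aves. `is_clade({"Crocodilia", "Aves"})` is True, as are the single groups Amphibia, Mammalia, and Aves used as classes in the conventional scheme.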