Cladistics II: Phylogenetic Inference

Characters and Character States
Hennig recognized that finding direct ancestors in the fossil record would be hard, and demonstrating that the fossil was a direct ancestor rather than a close relative of that ancestor would be even harder. So instead, his method focused on estimating the pattern of shared common ancestry. In order to make that estimation, cladistics looks at the distribution of derived characters [evolutionary specializations]. Here is how it works. Below is a grouping of some modern animals:

Each set contains multiple taxa. Each set is characterized by derived characters not seen in the taxa outside that set. For example, the lizard shares with all the others the presence of a five-fingered hand and claws. The platypus and all the others except the lizard share fur and milk. Since lizards (and alligators and frogs and fish and many other animals not shown here) do not have fur and milk, these traits are most likely explained by having evolved in the common ancestor of platypus, zebra, bear, tiger, and lion AFTER that ancestor had diverged from the common ancestor of lizard, platypus, zebra, bear, tiger, and lion. In other words, platypus, zebra, bear, tiger, and lion most likely had a more recent common ancestor with each other than any did with lizards.

It is generally easier to show this pattern using a simplified stick figure called a cladogram than the set diagram. Below is the cladogram showing the same information as the sets above:

The little red "tick marks" show that the derived character state appears at that point of the cladogram, and is passed on upwards. To show what that means, see the following:

So the common ancestor of zebra, bear, tiger, and lion evolved a placenta and live birth, and passed it on to its descendants. And the common ancestor of bear, tiger, and lion (but NOT zebra) had its molars evolve into carnassials (shearing teeth), and passed that trait on.

Not all character states are useful in figuring out evolutionary relationships. Primitive (or ancestral) states do not help. For example, both lizard and platypus lay shelled eggs rather than have a placenta and live birth. Does that mean that these two animals shared a more recent common ancestor with each other than either did with the other animals here? NO! In fact, having a shelled egg is the ancestral state for a larger (more inclusive) group of animals, including birds, crocodilians, turtles, snakes, echidnas, etc. Having a shelled egg is simply the condition (state) of all advanced terrestrial vertebrates. The technical term for a shared primitive character is a symplesiomorphy.

However, fur and milk are derived states not found in terrestrial vertebrates other than mammals, and indicate a common ancestor to these animals not found in lizards, birds, crocodilians, turtles, etc. The technical term for a shared derived character is a synapomorphy.

Similarly, sharing a five-fingered hand doesn't unite the lizard, platypus, and bear (for example) exclusive of the zebra. A five-fingered hand is a symplesiomorphy, and will be passed on that way unless it evolves into something else.

(NOTE: a character can be "primitive" at one level, but "derived" in a different context. In this cladogram, retractable claws are a derived feature that shows us that lions and tigers share a more recent common ancestor with each other than with anything else on this cladogram. However, if we were trying to figure out the relationship between lions, tigers, jaguars, and leopards, then retractable claws would be a primitive character shared by all of them, and so useless.)

Zebras have only a single finger on each hand. This is an autapomorphy or unique derived character. Although autapomorphies might be useful in understanding how that animal lived, they don't help us figure out where they fit in the cladogram, since those traits evolved after the ancestor of (in this case) zebras diverged from any shared ancestor of any of the other animals on the list. (However, if we added horses or donkeys to this cladogram, a single finger on each hand would be a good shared DERIVED character, and help us figure out that horses and donkeys share a common ancestor not shared with bears, lions, tigers, etc.).

Convergent characters could actually mislead us. For example, tigers and zebras both have stripes, whereas the other animals on this list don't. Does that mean that tigers and zebras share a common ancestor not shared by the other animals? We might think so at first, but the shared presence of carnassials in bears, lions, and tigers (but not zebras) and of retractable claws in lions and tigers (but not zebras) indicate a different pattern of ancestry. Convergent characters can be very annoying, because they might cause us to make the wrong estimation of the cladogram.
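The bookkeeping in the paragraphs above can be sketched in a few lines of Python. This is only an illustration, not a real phylogenetic analysis: the character list is taken from the text, while the names `DERIVED_IN`, `classify`, and `compatible` are invented here. The sketch labels each derived state by how many of the listed taxa show it, and then flags pairs of characters whose groupings cannot both fit on a single cladogram (two groupings fit only if one is nested inside the other or they do not overlap).

```python
# Taxa from the example above.
TAXA = ["lizard", "platypus", "zebra", "bear", "tiger", "lion"]

# For each character, the set of taxa showing the derived state (from the text).
DERIVED_IN = {
    "fur and milk": {"platypus", "zebra", "bear", "tiger", "lion"},
    "placenta and live birth": {"zebra", "bear", "tiger", "lion"},
    "carnassials": {"bear", "tiger", "lion"},
    "retractable claws": {"tiger", "lion"},
    "single finger": {"zebra"},
    "stripes": {"tiger", "zebra"},
}


def classify(holders, taxa=TAXA):
    """Label a derived state by how many of the sampled taxa possess it."""
    if len(holders) == 1:
        return "autapomorphy"
    if 1 < len(holders) < len(taxa):
        return "potential synapomorphy"
    return "uninformative among these taxa"


def compatible(a, b):
    """Two groupings fit on one cladogram if they are nested or disjoint."""
    return a <= b or b <= a or not (a & b)


# For each character, list the characters whose groupings contradict it.
conflicts = {
    name: [other for other, group in DERIVED_IN.items()
           if not compatible(DERIVED_IN[name], group)]
    for name in DERIVED_IN
}

for name, holders in DERIVED_IN.items():
    print(name, "|", classify(holders), "| conflicts with:", conflicts[name])
```

Taken alone, stripes look like a potential synapomorphy of tigers and zebras. Only the conflict with carnassials and retractable claws, which agree with each other, suggests that stripes are the convergent character.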

An additional type of character (not seen here) is the reversal: when an ancestor had a derived condition, but the trait evolved "back" to the primitive condition. For example, snakes are descendants of typical legged lizards. But snakes have evolved the loss of limbs, so that they resemble the limbless ancestor of vertebrates. For another, the common ancestor of living birds could fly, but ostriches, rheas, cassowaries, etc. have reverted to the primitive flightless state. Reversals that are unique characters do not help in estimating evolutionary relationships, but ones which are shared might. In general, though, reversals (like convergent characters) are annoying and are a major source of error for our analyses. Collectively, we call convergent characters and reversals homoplasies.

So, in review:

How do we tell which states (conditions) of a character are ancestral, and which are derived? One common method is to look at outgroups: more distant relatives of the taxa in question. Looking at fish, amphibians, turtles, lizards and snakes, crocodilians, and birds shows that no placenta is the primitive condition among vertebrates, and that the placenta of some mammals is a derived condition. (It is best to look at multiple outgroups.)
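Outgroup comparison can be sketched as a small function. This is a deliberately simplified illustration: the function name `polarize` and the rule "call a state ancestral only when every outgroup agrees" are assumptions made for this sketch, and the states follow the simplified statement in the text that the listed outgroups lack a placenta.

```python
# Character states in the outgroups, as stated (in simplified form) in the text.
OUTGROUPS = {
    "fish": "no placenta",
    "amphibians": "no placenta",
    "turtles": "no placenta",
    "lizards and snakes": "no placenta",
    "crocodilians": "no placenta",
    "birds": "no placenta",
}

# Character states in the mammals under study.
INGROUP = {
    "platypus": "no placenta",
    "zebra": "placenta",
    "bear": "placenta",
    "tiger": "placenta",
    "lion": "placenta",
}


def polarize(ingroup, outgroups):
    """Return (ancestral_state, taxa_with_derived_state).

    Assumed rule: the state is treated as ancestral only if all outgroups
    share it. If the outgroups disagree, return None (polarity unresolved).
    """
    observed = set(outgroups.values())
    if len(observed) != 1:
        return None
    ancestral = observed.pop()
    derived_taxa = {taxon for taxon, state in ingroup.items() if state != ancestral}
    return ancestral, derived_taxa


print(polarize(INGROUP, OUTGROUPS))
```

Using several outgroups matters because a single outgroup might itself carry a derived state, in which case the comparison would point the wrong way.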

Inferring Evolutionary Transformations and Dealing with Missing Data
No fossil is complete, so we can't always directly tell what character state was present in a given species. But we can sometimes use the phylogenetic position of a fossil to help us estimate what that state was. Below is a cladogram of tyrannosauroids (tyrant dinosaurs), with the number of fingers on the hand listed:
The hand is not yet known for Alioramus or Alectrosaurus. However, we know that Dilong has the ancestral condition of a three fingered manus (shared with many other carnivorous dinosaur groups), and that Gorgosaurus, Albertosaurus, and Tyrannosaurus all share the evolutionary specialization "loss of manual digit III" (meaning that they had a two-fingered hand). The simplest explanation is that the concestor of these three dinosaurs had already lost manual digit III, and passed on this derived trait to the different descendant lineages.

Because Alioramus (based on other evidence) is ALSO a descendant of that concestor, we predict that it too had a two fingered hand. To assume it had a three fingered hand would mean assuming an evolutionary change (in this case, a reversal) for which we had no direct evidence. The simplest explanation for our current observations is that Alioramus was a two-fingered dinosaur.

But things are unclear for Alectrosaurus. It might be that the loss of manual digit III occurred after the ancestor of Alectrosaurus diverged from the concestor of Albertosaurus through Tyrannosaurus: if that were the case, Alectrosaurus would have a three-fingered hand. OR it might be that the loss of manual digit III happened before the Alectrosaurus lineage and the Albertosaurus + Tyrannosaurus lineage had diverged: in that case, Alectrosaurus was a two-fingered dinosaur. So at present, things remain ambiguous for this taxon.

If we did have to choose, however, there are two different alternative strategies (or optimizations) for making this decision. Under the accelerated transformation (ACCTRANS) optimization, we assume the earliest possible parsimonious transformation (and thus emphasize gains and losses rather than convergence). Under the ACCTRANS model, Alectrosaurus would be optimized as having the two-fingered hand state. Alternatively, there is the delayed transformation (DELTRANS) optimization, in which we assume the latest possible parsimonious transformation (and thus emphasize convergence over gains and losses). Under DELTRANS, Alectrosaurus would be optimized as having a three-fingered hand state.
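The reasoning in this section can be checked by brute force. The sketch below rests on several assumptions that are not in the text: the exact tree shape (the text only says that Alioramus falls inside the Gorgosaurus + Albertosaurus + Tyrannosaurus clade and Alectrosaurus outside it), the internal node labels `Y`, `X`, `GA`, and `AT`, and a root fixed at three fingers because that is described as the ancestral condition. It tries every assignment of two or three fingers to the unknown nodes, keeps the assignments needing the fewest changes, and then picks the one whose change lies nearest the root (standing in for ACCTRANS) or farthest from it (standing in for DELTRANS). Real software uses more efficient algorithms.

```python
from itertools import product

# Assumed topology, written as child -> parent. Internal labels are hypothetical.
PARENT = {
    "Dilong": "root", "Y": "root",
    "Alectrosaurus": "Y", "X": "Y",
    "GA": "X", "AT": "X",
    "Gorgosaurus": "GA", "Albertosaurus": "GA",
    "Alioramus": "AT", "Tyrannosaurus": "AT",
}

# Known finger counts; the root is assumed to have the ancestral count of 3.
KNOWN = {"root": 3, "Dilong": 3,
         "Gorgosaurus": 2, "Albertosaurus": 2, "Tyrannosaurus": 2}
UNKNOWN = [node for node in PARENT if node not in KNOWN]


def depth(node):
    """Number of branches between a node and the root."""
    steps = 0
    while node != "root":
        node = PARENT[node]
        steps += 1
    return steps


def most_parsimonious():
    """Return (changes, summed depth of changed branches, states) for each best assignment."""
    results = []
    for states in product((2, 3), repeat=len(UNKNOWN)):
        assign = dict(KNOWN, **dict(zip(UNKNOWN, states)))
        changed = [c for c, p in PARENT.items() if assign[c] != assign[p]]
        results.append((len(changed), sum(depth(c) for c in changed), assign))
    fewest = min(r[0] for r in results)
    return [r for r in results if r[0] == fewest]


mprs = most_parsimonious()
acctrans = min(mprs, key=lambda r: r[1])[2]   # change as close to the root as possible
deltrans = max(mprs, key=lambda r: r[1])[2]   # change as far from the root as possible

print("Alioramus:", {r[2]["Alioramus"] for r in mprs})
print("Alectrosaurus:", {r[2]["Alectrosaurus"] for r in mprs})
print("ACCTRANS Alectrosaurus:", acctrans["Alectrosaurus"])
print("DELTRANS Alectrosaurus:", deltrans["Alectrosaurus"])
```

Under these assumptions Alioramus is two-fingered in every most-parsimonious assignment, while Alectrosaurus is two-fingered in one and three-fingered in the other, which is the ambiguity the two optimizations resolve in opposite directions.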

The Extant Phylogenetic Bracket
For any fossil form, we can find the two closest outgroups which are still extant. These represent the extant phylogenetic bracket (or EPB) of the fossil taxon. If we hypothesize that the extinct form has a particular soft tissue structure, behavior, or other character that cannot be directly observed from the fossil record, this inference falls into one of three categories of decreasing confidence:

Below are examples of each of these:

In the first case, the basal horse Eohippus is bracketed by Ceratomorpha (rhinos plus tapirs) and Equus (the living horse genus). Both members of the EPB possess fur, so the inference that Eohippus was furry is a secure Type I inference.

In the second case, the therapsid "protomammal" Thrinaxodon is bracketed by living mammals (which possess fur) and living reptiles (which do not). We might infer that Thrinaxodon was furry, but this is a less-secure Type II inference. Of course, preservation in a Konservat-Lagerstätte might someday show that Thrinaxodon had fur (direct fossil evidence) OR there might be a similar discovery of fur in a taxon more distantly related to mammals than Thrinaxodon (which would make the furry Thrinaxodon hypothesis a Type I inference, with an extinct but well-preserved taxon sitting in as part of the bracket).

In the final case, some paleontologists have hypothesized that the bird-hipped herbivorous dinosaurs in Ornithischia had mammal-like cheeks. While this is argued on circumstantial evidence, it represents the least-secure Type III inference using the EPB method, as neither member of the EPB (crocodylians and birds) has cheeks.
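The three examples above follow a simple rule: count how many of the two bracketing living groups possess the feature. A minimal sketch of that rule follows; the function name `epb_inference_type` is invented for this illustration.

```python
def epb_inference_type(first_has_trait, second_has_trait):
    """Classify an inference by how many members of the bracket show the trait.

    Both bracket members -> Type I, one -> Type II, neither -> Type III,
    following the three worked examples in the text.
    """
    count = int(first_has_trait) + int(second_has_trait)
    return {2: "Type I", 1: "Type II", 0: "Type III"}[count]


# The three cases from the text: (trait, fossil taxon, bracket states).
print("fur in Eohippus:", epb_inference_type(True, True))          # Ceratomorpha, Equus
print("fur in Thrinaxodon:", epb_inference_type(True, False))      # mammals, reptiles
print("cheeks in Ornithischia:", epb_inference_type(False, False)) # crocodylians, birds
```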

Consensus and Tree Support
Because of missing information, convergence, and reversals, there can be confusion as to the actual pattern of the Tree of Life. It is often the case that multiple different cladograms are equally well supported ("equally parsimonious"). In those cases, we can list all of them separately, or we can show a "consensus cladogram" that shows the ambiguity. This is seen by the presence of a polytomy: a node with more than two branches coming out of it. Below are three different equally simple cladograms for the ornithischian (bird-hipped) dinosaurs, and the consensus cladogram below:

Note the polytomy in the consensus cladogram.
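How a polytomy arises in a consensus can be sketched with two widely used methods: a strict consensus keeps only the clades found in every tree, and a majority-rule consensus keeps the clades found in more than half of them. The three trees below use hypothetical taxa A through E and are not the ornithischian cladograms; the function names are also invented for this sketch. Each tree is written as nested tuples, and each clade is represented by the set of tips it contains.

```python
from collections import Counter


def clades(tree):
    """Return the set of clades (frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):           # a tip
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)                   # record every internal node
        return members

    tips(tree)
    return found


def strict_consensus(trees):
    """Clades present in every input tree."""
    return set.intersection(*(clades(t) for t in trees))


def majority_rule(trees):
    """Clades present in more than half of the input trees."""
    counts = Counter(c for t in trees for c in clades(t))
    return {c for c, n in counts.items() if n > len(trees) / 2}


# Three hypothetical, equally parsimonious trees.
TREES = [
    ("A", ("B", ("C", ("D", "E")))),
    ("A", ("B", ("D", ("C", "E")))),
    ("A", (("B", "C"), ("D", "E"))),
]

for clade in sorted(strict_consensus(TREES), key=len):
    print("strict:", sorted(clade))
for clade in sorted(majority_rule(TREES), key=len):
    print("majority:", sorted(clade))
```

In the strict result only the groups B+C+D+E and A+B+C+D+E survive, so B, C, D, and E would be drawn as a four-way polytomy. The majority-rule result also keeps D+E and C+D+E, which resolves more of the tree at the cost of showing groups that one of the trees contradicts.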

There are several different strategies for generating consensus trees. Each maximizes the information in a different way, but like map projections of a globe onto two dimensions, they all lose or distort some element of the data in some fashion. Here are some of the main types of consensus trees:

Additionally, there are different methods for evaluating how secure the different results (i.e., the nodes generated) of the analysis are. The two main methods are Bremer Support (also called decay index):

and bootstrap value:

Names on Trees: Phylogenetic Systems of Nomenclature
Paleontologists and other biologists have started to use phylogenetic definitions for taxa: that is, they specify a particular relationship and any taxon that qualifies under that relationship is a member of that clade. Biologists have coined the word concestor for most recent common ancestor. While the chances of finding the concestor of any two (or more) clades are slim, we can recognize that there did exist such a species somewhere in time. This recognition gives us a way of defining taxa.

Paleontologists sometimes find it useful to refer to the crown-group: that is, the clade formed by all descendants of the concestor of the living members of a clade. The total-group is the clade composed of the crown-group plus all taxa sharing a more recent common ancestor with it than with the next most closely related crown-group. Finally, the stem-group is the paraphyletic part of the total-group excluding the crown-group. In the example below, the total-group Cephalopoda includes the modern Nautilida and Coleoidea and all taxa more closely related to them than to Scaphopoda. Nautilida and Coleoidea define the crown-group, which also contains the extinct clade Ammonoidea (which although extinct is nevertheless a descendant of the concestor of nautilids and coleoids). Plectronocerida and Endocerida, however, belong to the stem-group of cephalopods.

However, we can go beyond simply talking about crown- and stem-groups. For example, Amniota is defined as "the concestor of Mammalia and Reptilia, and all of its descendants." And Dinosauria is "the concestor of Iguanodon and Megalosaurus, and all of its descendants." So any taxon that comes from the node (joining point) that links Megalosaurus and Iguanodon is a member of Dinosauria (i.e., is a dinosaur); and anything that lies outside that node is NOT a dinosaur. So in the cladogram below:

Cetiosaurus, Hylaeosaurus (and, by definition, Megalosaurus and Iguanodon) are all dinosaurs, but Lagerpeton and Pterodactylus are not.

Names coined in this fashion are said to have node-based taxonomic definitions. These have the form "the concestor of A and B and all of its descendants" or more generally "the least inclusive clade containing A and B". (A and B are said to be the specifiers or anchor taxa.)

Names can also be defined another way: by sister group relationship rather than by concestry. For example, the two major branches of Dinosauria are Saurischia and Ornithischia. The former is defined as "Megalosaurus and all taxa sharing a more recent common ancestor with Megalosaurus than with Iguanodon", and the latter is defined as "Iguanodon and all taxa sharing a more recent common ancestor with Iguanodon than with Megalosaurus". So Cetiosaurus and Megalosaurus are both saurischians, and Hylaeosaurus and Iguanodon are both ornithischians. These names have branch-based taxonomic definitions. These have the form "A and all taxa sharing a more recent common ancestry with A than with B" or more generally "the most inclusive clade containing A but not B".

Similarly, the clade Dinosauromorpha is defined as Dinosauria and all taxa sharing a more recent common ancestor with Dinosauria than with Pterodactylus. Showing those relationships (by labeling the name, and by color blobs) onto the cladogram, we see:

In principle, one can also coin apomorphy-based definitions. These have the form "The clade stemming from the first organism or species to possess apomorphy M as inherited by A" or "the most inclusive clade exhibiting character state M that is synapomorphic with that in A". For example, in the diagram above, Hylaeosaurus is unique in possessing osteoderms (armored bony plates in its skin). One could define a clade "Hylaeosauroidea" with the form "The clade stemming from the first species possessing osteoderms as inherited by Hylaeosaurus". However, most practitioners of phylogenetic systems of nomenclature avoid this, as apomorphies almost always exhibit transitional stages, which makes the dividing point operationally difficult to define.
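The node-based and branch-based definitions in this section translate directly into two small searches over a tree: "the least inclusive clade containing A and B" and "the most inclusive clade containing A but not B". The sketch below assumes a particular topology for the six taxa named in the text. In particular, placing Lagerpeton as the closest relative of Dinosauria is an assumption of this sketch, as are the function names. Megalosaurus stands in for Dinosauria as the internal specifier of Dinosauromorpha.

```python
# Assumed topology for illustration, written as nested tuples.
TREE = ("Pterodactylus",
        ("Lagerpeton",
         (("Megalosaurus", "Cetiosaurus"),
          ("Iguanodon", "Hylaeosaurus"))))


def clades_of(tree):
    """List every clade (including single tips) as a frozenset of tip names."""
    out = []

    def tips(node):
        if isinstance(node, str):
            members = frozenset([node])
        else:
            members = frozenset().union(*(tips(child) for child in node))
        out.append(members)
        return members

    tips(tree)
    return out


def node_based(tree, a, b):
    """The least inclusive clade containing both A and B."""
    return min((c for c in clades_of(tree) if a in c and b in c), key=len)


def branch_based(tree, a, b):
    """The most inclusive clade containing A but not B."""
    return max((c for c in clades_of(tree) if a in c and b not in c), key=len)


dinosauria = node_based(TREE, "Megalosaurus", "Iguanodon")
saurischia = branch_based(TREE, "Megalosaurus", "Iguanodon")
ornithischia = branch_based(TREE, "Iguanodon", "Megalosaurus")
dinosauromorpha = branch_based(TREE, "Megalosaurus", "Pterodactylus")

print("Dinosauria:", sorted(dinosauria))
print("Saurischia:", sorted(saurischia))
print("Ornithischia:", sorted(ornithischia))
print("Dinosauromorpha:", sorted(dinosauromorpha))
```

Because the definitions refer only to specifiers and relationships, the membership of each named clade is recomputed automatically if the tree changes, which is the practical appeal of phylogenetic definitions.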

Minimum Divergence Dates: Phylogeny Meets Biostratigraphy
On a cladogram we don't normally see numerical time. Because the nodes on a cladogram represent the presence of shared common ancestors, they are a guide to the relative sequence of divergences. So in the cladogram of birds and bird-like dinosaurs:

the concestor of Eumaniraptora + Oviraptorosauria must have lived more recently (closer to us in time) than the concestor of (Eumaniraptora + Oviraptorosauria) + Alvarezsauria. (After all, the latter concestor is the ANCESTOR of the (Eumaniraptora + Oviraptorosauria) concestor!) But we can't tell numerical time from a cladogram.

However, we can plot the known stratigraphic ranges (that is, the range between the oldest and youngest specimen) of each taxon. Those are the black lines on the phylogeny to the right above. The dashed red lines represent the cladistic relationships between the taxa. We see that the divergence between Eumaniraptora and Oviraptorosauria had to have happened sometime before the oldest member of these sister taxa (that is, the mid-Middle Jurassic eumaniraptorans). The divergence between Alvarezsauria and the (Eumaniraptora + Oviraptorosauria) had to happen earlier than that. We don't know the exact divergence date unless we actually find the concestors; but we can find the minimum divergence date (in both these cases, the Middle Jurassic).

The phylogeny also shows us other information. We see that Oviraptorosauria is first known from the middle part of the Early Cretaceous. Assuming our cladogram is correct, there should be Middle Jurassic, Late Jurassic, and early Early Cretaceous representatives of the Oviraptorosauria lineage.
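The logic of minimum divergence dates can be sketched as follows: the minimum age of a node is the age of the oldest fossil among all of its descendants, and the gap between that age and a lineage's own oldest fossil is the stretch of time for which the lineage must have existed without a known record. The ages below are round illustrative placeholders in millions of years, chosen only to match the epochs named in the text (the Alvarezsauria value is entirely hypothetical). They are not measured dates, and the function name is invented for this sketch.

```python
# Cladogram from the text, written as nested tuples.
TREE = ("Alvarezsauria", ("Oviraptorosauria", "Eumaniraptora"))

# Oldest known specimen of each taxon, in millions of years ago.
# ILLUSTRATIVE PLACEHOLDERS ONLY, not real first-appearance dates.
FIRST_APPEARANCE_MA = {
    "Eumaniraptora": 168.0,      # stands for "mid-Middle Jurassic"
    "Oviraptorosauria": 125.0,   # stands for "middle part of the Early Cretaceous"
    "Alvarezsauria": 160.0,      # hypothetical value, not given in the text
}


def minimum_divergences(tree, oldest):
    """Map each internal node to its minimum divergence date.

    A node must be at least as old as the oldest fossil of any descendant.
    """
    dates = {}

    def visit(node):
        if isinstance(node, str):
            return oldest[node]
        age = max(visit(child) for child in node)
        dates[node] = age
        return age

    visit(tree)
    return dates


dates = minimum_divergences(TREE, FIRST_APPEARANCE_MA)
sister_pair = ("Oviraptorosauria", "Eumaniraptora")

# Time during which oviraptorosaurs must have existed but are not yet known.
missing_record = dates[sister_pair] - FIRST_APPEARANCE_MA["Oviraptorosauria"]

print("Minimum divergence of the sister pair (Ma):", dates[sister_pair])
print("Minimum divergence from Alvarezsauria (Ma):", dates[TREE])
print("Unrecorded oviraptorosaur history (Myr):", missing_record)
```

With these placeholder numbers both nodes receive the same minimum date, mirroring the statement that both divergences are constrained to the Middle Jurassic by the oldest eumaniraptorans.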


Last modified: 19 August 2016