In my previous post I simulated binary morphological trait data to evaluate the prevalence of cryptic diversity for morphologically complex and simple organisms. Here I aim to do the same thing for measurable (continuous) traits, which are often more abundant in algae (i.e., there are more measurable traits than discrete traits).

In addition, I want to look in more detail at how directional selection may influence the diagnosability of species. This is relevant because it is well known that habitat can have a profound effect on algal phenotypes. Finally, I will investigate how habitat-induced phenotypic plasticity affects species diagnosability. As before, I will tackle this problem with simulations of morphological trait evolution (see last week’s post).

Simulation 1: Effect of number of traits on species diagnosability

With the first set of simulations, I want to check if last week’s conclusion that more complex lineages have a lower prevalence of cryptic species is also valid for measurable (continuous) traits. To do this, I simulated the evolution of continuous morphological traits along a species tree. The simulation protocol is as follows:

  1. Simulate a Yule species tree (pbtree from phytools package) and rescale to have root-to-tip length of 1.
  2. Simulate evolution of the desired number of traits along the tree. I simulated under a simple diffusion process (Brownian motion model, σ² = 1.0) using OUwie.sim from the OUwie package. Seems like using a bazooka to kill a mosquito, but the choice for OUwie will become clear below.
  3. The result of the previous step is a set of trait values for each species.
  4. Loop through all species pairs and see how many can be distinguished from one another based on the trait values.

This overall procedure is similar to what I did for discrete traits, but there are a couple of important differences…

First, it’s no longer possible to count the number of distinct morphologies. Traits that vary along a continuous scale will never be exactly the same so the concept of “unique morphology” doesn’t make sense anymore.

Second, I needed to come up with a way to have a realistic amount of intraspecific variation of the continuous traits in the generated datasets. The simulations return only a single trait value for each species. To solve this, I looked at my Halimeda morphometric datasets and noticed that the standard deviation of traits is typically about 15% of the mean value for those traits. So, to get variation of intraspecific trait values, I used a normal distribution with the simulated trait value as the mean and 15% of this value as the standard deviation. Not a particularly elegant way of simulating phenotypic variance in populations, but good enough for the purpose…

Lastly, for step 4 of the procedure, we need to calculate the percentage of species that can be distinguished from one another. This is easy for discrete traits (the character combinations of the two species are either identical or different), but quite difficult for continuous traits. How different do two species need to be to call them morphologically distinguishable? I decided to sample 20 values from the distribution of each trait (i.e., the normal distribution explained in the previous paragraph). This is an attractive solution because it is equivalent to constructing a morphometric dataset by taking measurements of all traits on 20 randomly selected samples from each species. Then, I compared the two species trait by trait. If one (or more) of the traits had non-overlapping ranges, the species were considered distinguishable. In fact, I used the range between the 2.5th and 97.5th percentiles of the sampled trait values to allow for a tiny bit of overlap. If there was overlap between the ranges of all traits, the species were considered indistinguishable.
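
Here is a minimal sketch of that criterion in base R. The function names are made up for illustration and are not taken from the downloadable simulation code. Taking the absolute value of the trait mean before computing the standard deviation is an added assumption, so that a negative simulated value still gives a valid standard deviation.

```r
# Draw n "specimens" for one species: one column per trait, with the
# standard deviation set to 15% of the (absolute) simulated trait value.
sample_species <- function(trait_means, n = 20, cv = 0.15) {
  sapply(trait_means, function(m) rnorm(n, mean = m, sd = cv * abs(m)))
}

# Range between the 2.5th and 97.5th percentiles of the sampled values.
central_range <- function(x) {
  quantile(x, probs = c(0.025, 0.975), names = FALSE)
}

# Two species are distinguishable if at least one trait has
# non-overlapping central ranges.
# a, b: matrices of measurements (rows = specimens, columns = traits).
distinguishable <- function(a, b) {
  for (j in seq_len(ncol(a))) {
    ra <- central_range(a[, j])
    rb <- central_range(b[, j])
    if (ra[2] < rb[1] || rb[2] < ra[1]) return(TRUE)
  }
  FALSE
}

set.seed(1)
sp1 <- sample_species(c(10, 5, 2))
sp2 <- sample_species(c(10.5, 5.2, 6))  # third trait differs strongly
distinguishable(sp1, sp2)
```

Looping this check over all species pairs and taking the mean of the results gives the percentage of distinguishable species pairs.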

Now let’s get back to the simulations. I started by running a simulation for 10 traits and 20 traits to see if simple organisms are harder to distinguish from each other than complex organisms. The number of taxa in the simulated trees was varied between 10 and 100 and the outcome was summarized into a boxplot. Remember that we previously saw that the number of species does not affect the percentage of distinguishable species, so a boxplot suffices to summarize the results. Here are the results:


As expected, the percentage of distinguishable species is higher for complex organisms (72.5% for organisms with 20 characters) than for simpler organisms (54.2% for organisms with 10 characters). This is congruent with what we found for discrete characters.

Simulation 2: Effect of habitat-induced selection on the phenotype

The second thing I wanted to look at is how selection on morphological traits would influence how easy it is to distinguish species based on those traits. In this second set of simulations, I followed this procedure:

  1. Simulate a Yule species tree (pbtree from phytools package) and rescale to have root-to-tip length of 1.
  2. Simulate in which of five possible habitats the species reside. This is done by “simulation mapping” of a discrete trait with 5 states (representing 5 habitats) using the sim.history function in phytools. The rate of the Markov process controlling habitat evolution was set at 0.3 and it was enforced that all habitats are occupied at the end of the simulation.
  3. Simulate evolution of the desired number of traits along the tree.
    1. Half of the traits were simulated as before (no selection, simple Brownian motion model, σ² = 1.0).
    2. The other half of the traits were simulated under directional selection, with an Ornstein-Uhlenbeck model that evolves towards different optimal trait values depending on which habitat the lineage in question occupies. Parameter values were α = 0.5, σ² = 1.0 and θ = [1, 3, 5, 7, 9]. In other words, if a lineage is in habitat #1, the trait will be pulled towards the optimal value of θ₁ = 1 with a strength of α = 0.5. For habitat #4, this would become a pull towards θ₄ = 7 of the same strength α. The state at the root of the tree (θ₀) was set at 5 (i.e., the median of the θ vector).
    3. OUwie.sim from the OUwie package was used to carry out the simulations.
  4. As before, the result of the previous step is a set of trait values for each species.
  5. Loop through all species pairs and see how many can be distinguished from one another based on the trait values, again using the procedures described above.
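
To give a feel for what the Ornstein-Uhlenbeck pull in step 3 does, here is a small base R sketch for a single lineage that stays in one habitat for the whole time. It is not OUwie.sim and it ignores the tree and the habitat switches. The function names are hypothetical, and the update rule is the standard exact discretisation of an OU process.

```r
# OU trait evolution along ONE lineage that never changes habitat.
simulate_ou <- function(x0, theta, alpha = 0.5, sigma_sq = 1.0,
                        total_time = 1, n_steps = 100) {
  dt <- total_time / n_steps
  decay <- exp(-alpha * dt)
  # Standard deviation of the random part of one step of length dt
  step_sd <- sqrt(sigma_sq / (2 * alpha) * (1 - exp(-2 * alpha * dt)))
  x <- numeric(n_steps + 1)
  x[1] <- x0
  for (i in seq_len(n_steps)) {
    # Deterministic pull towards theta plus a random perturbation
    x[i + 1] <- theta + (x[i] - theta) * decay + rnorm(1, 0, step_sd)
  }
  x
}

# Expected trait value after time t (the random part averages out)
ou_expected <- function(x0, theta, alpha, t) {
  theta + (x0 - theta) * exp(-alpha * t)
}

theta <- c(1, 3, 5, 7, 9)
ou_expected(x0 = 5, theta = theta, alpha = 0.5, t = 1)
```

With α = 0.5 and one time unit from root to tip, a lineage that starts at the root value of 5 is expected to cover only about 39% (1 − e^−0.5) of the distance to its habitat optimum, so the habitat-specific trait distributions are shifted apart but not fully separated.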

Here’s what came out of this simulation:


Pretty cool. There’s an increase in how many species can be distinguished from each other in both cases. While the increase from 54.2 to 59.7 for the 10-character situation is obviously not significant, the increase from 72.5 to 85.8 for the more complex organisms certainly is. I had not expected this result. I had expected a decrease. After all, habitat selection drives morphological traits to certain “optimum values”, and such traits would thus not contribute to distinguishing between species that live in the same habitat.

The reasoning above is true, but incomplete. Only 50% of the characters are driven towards optimum values while the other 50% evolve free from selective forces. Selection subdivides the morphologies into five habitat-specific categories, thereby subdividing the species distinguishability problem into five smaller sub-problems (one for each habitat). These smaller subproblems are easier to solve with the remaining characters that are not under selection, leading to an overall increase of species distinguishability compared to the simulation without selection.

Simulation 3: Effect of phenotypic plasticity in response to habitat

Clearly, selection is only part of the story. So far, I have assumed that every species lives in a single habitat. In most organisms, and this is certainly true for algae, one also has species that live in multiple environments and feature adaptive morphological plasticity in response to those environments.

The effect of plasticity in response to habitat is harder to simulate using the type of approach I’ve chosen, but here’s the simulation design I came up with:

  1. Simulate a Yule species tree (pbtree from phytools package) and rescale to have root-to-tip length of 1.
  2. Simulate which of five habitats the species live in as in the previous simulation.
  3. Simulate a binary trait to create lineages with and without phenotypic plasticity.
    1. Perform “simulation mapping” of a binary trait along the tree, where one state denotes plastic and the other non-plastic. This was done with sim.history (phytools).
    2. For simplicity and to avoid difficulties associated with plastic species returning to non-plastic, I forced the root state to be non-plastic and only allowed changes from non-plastic to plastic. The latter was achieved by setting the plastic to non-plastic rate to 10⁻¹⁰. The non-plastic to plastic rate was 1.0.
    3. I also forced the fraction of plastic and non-plastic species to be similar (at least 1/3 plastic and at least 1/3 non-plastic) by repeating the simulation mapping until this condition was met.
  4. Simulate evolution of the desired number of traits along the tree.
    1. Half of the traits were simulated without selection (Brownian motion model, σ² = 1.0).
    2. The other half of the traits were simulated under directional selection with an Ornstein-Uhlenbeck model as described above (simulation 2).
    3. The difference with the simulation above is that lineages that show phenotypic plasticity were assumed to occupy all five habitats. For these lineages, five separate evolutionary tracks were simulated, i.e. one towards the optimum of each habitat.
    4. OUwie.sim from the OUwie package was used to carry out the simulations.
  5. The result of the previous step is a set of trait values for each species.
  6. Loop through all species pairs and see how many can be distinguished from one another based on the trait values, again using the procedures described above.

What’s different from before is that instead of having one mean trait value per species, we now end up with five mean trait values for plastic species (because they were simulated along 5 evolutionary tracks towards different optima). So I sampled 4 values from each of the corresponding five distributions (normal, mean = simulation outcome, standard deviation = 15% of mean). This resulted in 20 trait measurements for comparison to other species in step 6.
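
The sampling scheme for plastic versus non-plastic species could look something like the sketch below. The function name and the example trait values are invented for illustration. As in the earlier simulations the standard deviation is 15% of the mean, and the absolute value is an added assumption to keep it valid for negative means.

```r
# Sample n_total measurements of one trait for one species.
# habitat_means holds one simulated value for a non-plastic species,
# or five values (one per habitat track) for a plastic species.
sample_trait <- function(habitat_means, n_total = 20, cv = 0.15) {
  stopifnot(n_total %% length(habitat_means) == 0)
  n_per <- n_total / length(habitat_means)
  unlist(lapply(habitat_means, function(m) {
    rnorm(n_per, mean = m, sd = cv * abs(m))
  }))
}

set.seed(42)
non_plastic <- sample_trait(3.4)                        # 20 values, 1 track
plastic <- sample_trait(c(3.4, 4.2, 5.0, 5.8, 6.6))     # 4 values x 5 tracks
range(non_plastic)
range(plastic)
```

Because the 20 measurements of a plastic species are pooled across five different means, the range of its sampled values tends to be wider, which makes overlap with other species more likely.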

Here are the results:


Neat. The species distinguishability clearly drops from the condition with selection and without plasticity (59.7 to 48.6% for the simpler organisms and 85.5 to 69.7% for the more complex organisms). In other words, plasticity has a strongly negative effect on the potential to recognize species based on their morphology. Any advantages brought about by habitat selection (i.e. subdivision of the species distinguishability problem into sub-problems) are completely wiped out by the presence of species that have distinctive morphologies in the different habitats they inhabit.

Wrapping up

That was an interesting set of experiments. Let me just recapitulate the most important results:

  1. Species from character-poor lineages are more difficult to distinguish from one another than species from character-rich lineages.
  2. Selection towards habitat-specific phenotypic optima increases rather than decreases our ability to distinguish between species.
  3. Habitat-determined phenotypic plasticity within species greatly reduces the likelihood that one can distinguish between species based on morphology, even in complex organisms.

Obviously, these are just a handful of simulations, and I don’t expect these results to be valid across a wider range of parameter settings. For example, I would expect that point 2 may not hold if a greater proportion of characters are under selection. I would also anticipate that the relative importance of the drift (σ²) and directional (α) components of the Ornstein-Uhlenbeck model may change things. Perhaps I will explore this further for another post. Or you could do it yourself.

You can download the code for these simulations from here.

These results are also presented in a paper that is about to appear in Journal of Phycology. [UPDATE: This paper is now out here. A PDF is available here]


Algae have the annoying tendency to show high levels of cryptic diversity, i.e. with distinct species being morphologically indistinguishable. This has been shown repeatedly by first assessing species boundaries using DNA work or crossing studies, and subsequently comparing these species boundaries with morphological features.

I’ve always been interested in how morphological complexity of organisms relates to their tendency to produce cryptic species. When we found cryptic diversity in Pseudochlorodesmis, a genus in which the algal body is utterly simple, we argued that this may be due to its simplicity: “From a strictly morphological point of view, it is simple to conceive that the potential prevalence of cryptic diversity within any given taxon is a function of its morphological complexity. For example, if the morphology of the members of the taxon can be scored as a set of X binary characters, and morphological species boundaries are defined by a minimum of one character difference, the maximum number of morphologically determinable species increases exponentially with the number of characters available (N = 2^X). In other words, for a higher taxon containing a given number of species, chances of encountering cryptic diversity increase dramatically with decreasing morphological complexity.” (Verbruggen et al. 2009 J. Phyc. 45: 726-731)
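
As a quick worked example of the N = 2^X relationship from the quote, this snippet computes the maximum for a few arbitrarily chosen numbers of binary characters and enumerates every possible morphology for X = 3:

```r
# Theoretical maximum number of distinct morphologies for X binary characters
n_traits <- c(5, 10, 20)
max_morphologies <- 2^n_traits
data.frame(n_traits, max_morphologies)

# Enumerate every character combination for X = 3 (one row per morphology)
all_morphologies <- expand.grid(rep(list(0:1), 3))
nrow(all_morphologies)  # 8
```

Ten binary characters allow at most 1,024 distinct morphologies, and twenty allow 1,048,576.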

Of course, the N = 2^X is a theoretical maximum, and I wouldn’t expect all theoretically possible morphologies to be produced in the course of the evolution of a lineage. To look at this in some more detail, I’ve done a few simulations. This approach consists of generating phylogenetic trees containing a number of species, and subsequently letting a set of traits (i.e. morphological characters) evolve along this phylogeny at a rate that corresponds to those measured for a real algal morphometric dataset. The result of this exercise is a set of values for each trait for each species in the phylogeny. Those can then be compared with each other to evaluate how many unique morphologies there are and how many of the species can be reliably distinguished from one another morphologically.

First, I wanted to quantify how fast your average discrete morphological trait evolves in algae. So I took one of my morphological datasets for Halimeda (mostly unpublished, but similar in nature to Verbruggen et al. 2005 J. Phyc. 41: 606-621) and a corresponding phylogenetic tree of the species in that dataset. The tree is a chronogram, which was rescaled to have a root-to-tip path length of 1. Five of the variables in the dataset are discrete, and I calculated the rate of the Markov process for these using the fitDiscrete function in the geiger package for R. Here are the results:

> print(mkr)
   perwall perfusions    secinfl     secper   segundul
 5.0208161  1.0331563  0.8157761  5.4067015  0.1252521

Cool. The evolutionary rates of the traits vary quite a bit. I decided to start the simulations with the lowest of these rates (I used 0.1), and then increase the rate later on.

Here’s a breakdown of the simulation function:

    • simulate Yule tree (pbtree from phytools package)
    • rescale tree to have root-to-tip length of 1
    • simulate the evolution of the desired number of traits along the tree (rTraitDisc from ape package)
    • count the number of unique morphologies (trait combinations) produced during the simulation
    • loop through all species pairs and score how many are distinguishable from each other (two species are considered distinguishable if they have at least one trait that differs between them)
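
The last two steps can be sketched in base R as follows. The function names and the toy trait matrix are made up for illustration. In the real simulation the matrix would hold the simulated trait states, with one row per species.

```r
# Number of unique trait combinations (rows) in a species-by-trait matrix
count_unique_morphologies <- function(traits) {
  nrow(unique(traits))
}

# Fraction of species pairs that differ in at least one trait
fraction_distinguishable <- function(traits) {
  pairs <- combn(nrow(traits), 2)
  differs <- apply(pairs, 2, function(p) {
    any(traits[p[1], ] != traits[p[2], ])
  })
  mean(differs)
}

# Toy example: four species scored for three binary traits
traits <- rbind(
  sp_a = c(0, 1, 0),
  sp_b = c(0, 1, 0),  # identical to sp_a, so this pair is cryptic
  sp_c = c(1, 1, 0),
  sp_d = c(0, 0, 1)
)
count_unique_morphologies(traits)  # 3
fraction_distinguishable(traits)   # 5 of the 6 pairs differ
```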

This procedure was repeated for trees containing different numbers of species (from 10 to 400), with the number of unique morphologies and the fraction of distinguishable species pairs being retained at each step. Now let’s plot some results…


That’s quite spectacular. At this rate of trait evolution you get FAR fewer unique morphologies than there are species. For organisms with 20 traits, you get only about 50 unique morphologies even though there are 400 species. That’s a lot of cryptic diversity. For organisms with 10 traits, the situation is even worse and only about 20 unique morphologies are produced for 400 species. Okay, now let’s plot the percentage of distinguishable species pairs…


The blue triangles (20 characters) clearly lie above the golden dots (10 characters), reflecting that organisms with lower morphological complexity have fewer unique morphologies and are harder to distinguish from one another. In other words, lower morphological complexity leads to higher levels of cryptic diversity. That was expected, but nice to see it confirmed in the simulation.

Another interesting feature of this graph is that there is no relationship between the number of taxa in the tree and the percentage of distinguishable species (flat lines). At first, this seemed counterintuitive to me. When given a certain amount of time to diversify (one time unit from root to tips), and with a fixed rate of morphological evolution, shouldn’t more diverse lineages have more species that look the same? Actually, no, because trees with more species have a higher total tree length. The root-to-tip distance is still the same, but you have more lineages that add to the total tree length and thus to the total amount of evolution in the morphological trait. So those flat lines do make sense.
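
The tree-length argument can be checked with a rough base R sketch. This is a simplified Yule process written for illustration, not the pbtree implementation: while k lineages are alive, the waiting time to the next speciation is assumed to be exponential with rate k times the birth rate, and the tree is then rescaled to a root-to-tip length of 1.

```r
# Total branch length of a simplified Yule tree after rescaling its
# height (root-to-tip distance) to 1.
rescaled_tree_length <- function(n_tips, birth = 1) {
  k <- 2:n_tips                               # lineages alive per interval
  waits <- rexp(length(k), rate = k * birth)  # duration of each interval
  sum(k * waits) / sum(waits)                 # total length / tree height
}

set.seed(1)
sizes <- c(10, 50, 100, 400)
mean_length <- sapply(sizes, function(n) {
  mean(replicate(200, rescaled_tree_length(n)))
})
data.frame(n_species = sizes, mean_total_length = mean_length)
```

Under these assumptions the total tree length grows with the number of species even though the root-to-tip distance stays fixed at 1.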

I’ve also tried these simulations with different rates of evolution:


Clearly, traits with higher rates are better at distinguishing between species than slower traits, reducing the number of cryptic species in a lineage.

In conclusion, let me just wrap up the core results from this exercise:

    1. There are substantially fewer unique morphologies than there are species.
    2. Character-poor lineages produce fewer unique morphologies than character-rich lineages.
    3. Lineages with fast-evolving traits feature less cryptic diversity than those with slow-evolving traits.

Because of these results, we can expect cryptic diversity to abound, especially in character-poor lineages. As such, for any given algal taxon, we should expect to be unable to distinguish between at least some and possibly many of its species based on morphology alone.

Some of these results are presented in a paper that is about to appear in Journal of Phycology. [UPDATE: This paper is now out here. A PDF is available here]

You can download the code for these simulations from here.

These are the notes for a presentation I just uploaded to SlideShare. I gave this as a seminar at the University of Melbourne last Tuesday and at La Trobe University two days later.


Slide 1

  • introduce

Slide 2

  • outline of the talk

Slide 3

  • student Dioli Payo
  • genus Portieria
  • pretty thallus shape – well-described with fractals
  • only two species known worldwide
  • her goal was to look at population structure of the species

Slide 4

  • her sampling localities in the Philippines

Slide 5

  • sequenced rapidly evolving marker from mt genome
  • we applied GMYC to the data
  • this is a quick ‘n dirty method to detect species boundaries
  • she found 21 species instead of 1
  • these are distinct species that have been separated for millions of years
  • species are cryptic => impossible to distinguish morphologically
  • they have limited distribution ranges, often a single island or bay

Slide 6

  • what does this mean globally?
  • our global sampling is not nearly as good
  • we’re at 50 species and counting
  • difficult to extrapolate but it could be well over 100 spp

Slide 7

  • we have a situation where much of what we think we know about species diversity is wrong
  • not only the case for Portieria, we know this is true for many algae, although perhaps not as spectacularly high diversities
  • what does this mean…
  • as a taxonomist to describe => they all look identical
  • every conservation decision that has ever been made that involves seaweeds needs to be revisited
  • more work for me at all levels: (1) difficult to study biodiversity patterns in meaningful way, (2) cannot trust a single species record from the literature or from online databases, (3) much denser sampling is needed in the field and DNA sequencing for every single specimen

Slide 8

  • move on to biodiversity
  • focus on understanding processes => diversification
  • geographic and ecological dimensions

Slide 9

  • our approach consists of this

Slide 10

  • our approach for the more visually inclined
  • start with phylogeny calibrated in geological time

Slide 11

  • add information about contemporary species
  • in this case macroecological: sea surface temperature

Slide 12

  • inference about past using models of evolutionary change
  • this way we can study how evolution of thermal affinities relates to figure below
  • since the phylogeny includes speciation events (bifurcations) we can relate niche evolution to diversification

Slide 13

  • these are the three model systems we’ve developed
  • very dense global sampling
  • starting to get to grips with what the species are and where they occur

Slide 14

  • start with geographic patterns of diversification

Slide 15

  • we aimed for general patterns, not individual case studies
  • hence focus on entire order of brown algae, the Dictyotales
  • you see some of the genera illustrated here

Slide 16

  • they are Olivier’s pet group so we know a lot about them
  • distributed worldwide across tropics and temperate water
  • we have > 2000 barcoded or accurately identified specimens belonging to 236 species
  • gives us pretty good idea of the distributions of the species

Slide 17

  • we want to know…
  • we have …
  • so we need a window into the past to see what happened

Slide 18

  • as explained before, models of evolutionary change offer a solution
  • relevant evolutionary events are parameters in the model, which is then optimized
  • with optimized model, we can infer things about the evolutionary events and estimate the ancestral situation
  • for biogeography => relevant parameters relate to how species move around
  • simple form with areas A/B
  • explain parameters for dispersal-extinction-cladogenesis
  • generalize to more areas
  • what it can do => phylogeny + current distribution => biogeographic history

Slide 19

  • we did this for Dictyotales
  • simple subdivision of world in three biogeographic regions: northern temperate, tropical, southern temperate
  • remember colors

Slide 20

  • change to Preview (cf. next page)

Slide 21

  • zoom in on terminal species, legend corresponds to colors in slide 19
  • reconstructed ancestral states are also there
  • show example of speciation associated with S to N shift
  • show example of speciation within region
  • base of Dictyoteae: temperate southern hemisphere
  • some lineages stay there (e.g. Dilophus)
  • at base of Dictyota more generalist
  • gives rise to a mixture of tropical and temperate lineages
  • top lineage: origin is tropical, moves into N temp on several occasions
  • next lineage down: all temperate, with S origin, dispersing into N
  • lineage all the way at bottom: starts in tropics, moves into S, later moves from S into N

Slide 22

  • tree is great to look at specific cases but doesn’t give the global picture
  • these are summary graphs
  • dispersal rate through time => 3 types are substantially higher than others
  • movement out of tropics
  • movement from S to N

Slide 23

  • put this in perspective
  • slide shows decreasing SST through Cenozoic
  • narrowing tropical belt
  • more temperate habitat opening up in S and N
  • movement from tropics to temperate
  • north is major sink because there was almost no temperate habitat => tropics and S feed into N

Slide 24

  • move on to macroecological correlates of diversification

Slide 25

  • case study Halimeda
  • diversity map => high diversity in tropics, with a few species in temperate habitat
  • so where is the origin? tropics or temperate
  • how often do niche shifts between temperate and tropical occur?

Slide 26

  • we have lots of DNA barcodes
  • we get SST for localities using satellite imagery
  • we get an idea of SST affinities of species
  • how do affinities evolve?

Slide 27

  • similar methods as before => model optimized
  • every tip is species
  • color gradient shows SST affinities
  • tropical origin
  • marked conservatism for tropical SST in clades 2-5
  • conservatism lost in clade 1 => 4 transitions into temperate
  • in perspective: show time frame and correspondence to narrowing tropics

Slide 28

  • do these modes of speciation and the shifting niches have implications for the distribution of biodiversity on the planet?

Slide 29

  • typical diversity patterns: well-characterized => bell-shaped around tropics
  • many possible explanations
  • my goal is to provide macroevolutionary perspective
  • higher species turnover in tropics => higher rate of diversification

Slide 30

  • seaweeds don’t follow general rules => bimodal diversity pattern
  • do same evolutionary processes hold or is diversification faster in temperate habitats?

Slide 31

  • Codium is suitable case study with similar diversity map

Slide 32

  • evolution of SST affinities traced along phylogeny
  • clade 3: almost half of all species in young clade, only 25 Ma
  • seems to be associated with move from temperate into tropics

Slide 33

  • logical question: is diversification faster in tropics

Slide 34

  • model of diversification dynamics in which diversification is function of SST

Slide 35

  • optimum value of beta => positive association between SST and diversification
  • higher rates in tropics
  • so process seems similar to other organisms and reasons for bimodal diversity pattern have to be sought elsewhere

Slide 36

  • so why is Codium richer in colder water?
  • probably due to historical causes
  • origin is in temperate waters and a lot of the branches remain in those temperate waters
  • it appears that the genus has only invaded the tropics recently and that, because of that, the majority of species is still in temperate water

Slide 37

  • no such thing for Dictyota => constant diversification explains it better

Slide 38

  • previous test only checked for very simple relationship between SST and diversification
  • many other types of relationships you could imagine
  • for example one could expect that clades whose niches are more evolvable manage to diversify more rapidly
  • we do seem to find that in Dictyota
  • split phylogeny up in major clades
  • positive relationship between rate of SST evolution and diversification
  • slope very deviant from that simulated under null model

Slide 39

  • lineages with many allopatric sister species along latitudinal thermal gradient diversify more rapidly
  • we seem to have a situation where some clades manage to speciate more often along the latitudinal thermal gradient than others
  • clades that do, diversify more rapidly, probably because their presence in both temperate and more tropical habitats permits further radiation in those habitats

Slide 40

  • so, we saw that evolvability of the macroecological niche leads to more rapid diversification
  • where does that evolvability come from?

Slide 41

  • student Vanessa Marcelino was studying the evolution of microhabitat traits and macroecological traits in Halimeda
  • she decided to investigate in more detail whether there could be an interaction going on between micro and macro
  • Halimeda is mostly tropical and of tropical origin
  • found in different habitats on coral reef
  • exposed wave-swept and more sheltered e.g. reef slope but also lagoon
  • one could expect that SST evolution is faster for species in exposed microhabitats because they experience more extreme environments (low tide, wave action, etc)

Slide 42

  • compare model in which rate of SST evolution is constant with one in which it depends on whether or not species lives in exposed habitat

Slide 43

  • 2-rate model performs considerably better
  • difference in AIC 7.6 => integrated across uncertainty in exact pattern of evolution of microhabitat preference
  • lineages from exposed habitat 4.3x faster
  • so, it appears that microhabitat specializations can be exaptations for macroecological shifts

Slide 44

  • wrap up
  • for speciation, no “one rule fits all” => examples of everything you can imagine (allopatric vs. within region, associated with niche shift vs. conservatism)
  • for distributions, some patterns did come out => tropics act as source, with confirmation of “out of the tropics” hypothesis for Dictyotales; north is major sink because so recent
  • for diversification, all kinds of things going on: (1) simple relation with historical effect in Codium, (2) role of evolvability in Halimeda and Dictyota, (3) I think the evolvability aspect may emerge as a general pattern as more taxa are studied
  • reach out => (1) better models can be designed, (2) evolutionary dimension is applicable to any problem that any biologist is working on

Slide 45

  • these folks did the hard work

Slide 46

  • funding agencies
  • collaborators and collectors => due to the dense sampling that we need, lots of samples are required, and we could not do what we do if it wasn’t for all these people volunteering their time

I just came across a very interesting opinion paper titled “No name, no game” published in the European Journal of Taxonomy.

The paper was written by Yves Samyn of the “Belgian National Focal Point to the Global Taxonomy Initiative” (I think we all agree they need an acronym) and Olivier De Clerck of Ghent University. I’ve known Yves since we were both on a field trip in KwaZulu-Natal (South Africa) many years ago, and Oli is a great colleague and friend who I’ve worked with very closely for over ten years.

They argue that, in contrast to what Joppa et al. (2011) claim, today’s taxonomic workforce is not sufficiently large to describe the remaining pool of missing species within a reasonable amount of time. This is in the first place because much larger numbers of species remain to be described for many understudied taxa than for the well-studied groups of organisms that Joppa et al. (2011) included in their analysis. In addition, the massive numbers of unnamed species in the GenBank and BOLD databases suggest that there is another layer of undiscovered diversity remaining to be characterized (coined “dark taxa” by Rod Page). This is certainly relevant for algae as these unnamed species (e.g. “Rhodymenia sp. 1SA”) are discovered en masse when DNA barcodes are generated and “dark algal species” are accumulating rapidly in GenBank (see figure below; >75% dark taxa in the three main algal groups in 2011). The great majority of these discovered species remain without a proper name because formally describing them is much more laborious than discovering them.

algal dark taxa

Yves and Oli argue that this widening gap between the number of discovered and described species is problematic, focusing their argument on the fact that these newly discovered species do not have names. They argue that scientific names matter for society, for example because legislation (e.g. CITES) uses species names as currency.

While I agree with most of the paper, in particular the part about promoting an increasing role for developing countries in characterizing their biodiversity, I think that Yves and Oli fail to make a convincing case for their “no name, no game” statement. In my opinion, traditional binomials are not needed for legislation to work or for scientists and non-specialists to communicate about species. When the bird flu hit, the specialist as well as the greater audience knew and understood what H5N1 was. Just like professional and amateur astronomers have no trouble communicating about “55 Cancri e”. What would make biologists different? All one needs to communicate about a species is some sort of identifier, not necessarily a formally described species binomial.

When it comes to legislation and conservation, I agree that it is important to be able to pinpoint exactly what is being conserved. But once again, does it need a binomial? Not having to go through the process of describing a newly discovered species would permit that species to be conserved more rapidly. Furthermore, for legislative purposes, diagnosability of the species should be more important than the name of the species. And at least for algae, where DNA data have become the gold standard for species delimitation, DNA sequences are rapidly becoming much more reliable for species identification than morphological keys to named species. While the DNA vs. morphology contrast should not play a major role in this discussion, it is relevant because the great majority of dark taxa are discovered through DNA sequencing, and future collections can easily be identified as the dark taxon in question with a DNA barcode. In other words, DNA sequencing has changed the game, and because of that I think we should think more along the lines of “no name, new game” instead of “no name, no game”.

Once again, I agree with what Yves and Oli wrote about the taxonomic workforce not being large enough to describe the remaining pool of species in understudied groups within a reasonable timeframe using traditional procedures. I also agree with most other points made in the paper. But do we really need formal species binomials for all newly discovered taxa? Are there arguments that support the “no name, no game” statement that I have overlooked here? Or arguments in favor of the “no name, new game” alternative that I have not mentioned? I welcome your ideas in the comments.

Most of the projects that my coworkers and I work on involve analyses of big datasets with information about algal specimens. One of Tom Schils’ projects that I’m helping out with aims to sketch the geographical patterns of seaweed diversity using a combination of tools. Tom has been accumulating floristic information that we are now trying to complement with DNA sequence data to characterize how species diversity and phylogenetic diversity are distributed on Earth.

The Hawaiian Algal Database is a superb resource of information about — you guessed it — Hawaiian algae. The data were generated, compiled and put online by Alison Sherwood and Gernot Presting of the University of Hawaii at Manoa, and a report about the dataset was published in BMC Plant Biology. It’s a specimen-centered database that has all sorts of metadata including geographical coordinates, information about the collection site, and in many cases DNA sequences of up to 3 markers from different genomes (yes, algae have 3, at least).

Because the data are available only through the online HADB interface, Tom encouraged me to write a script to download the information we needed to integrate the Hawaiian data with ours. I wrote a Perl script that uses the LWP library to download and store the information in a more analysis-friendly format. In case anyone is interested, I’m linking the script here.

I downloaded information for the 221 specimens of brown algae, 238 of green algae and 2163 of red algae in the dataset. What’s absolutely great is that for the reds, 61% of the samples have been sequenced; that’s 1333 sequenced specimens belonging to 213 unambiguously named species! Unfortunately far fewer specimens of greens (43) and browns (25) were sequenced.

[Figure: Stats for Hawaiian Algal Database specimens]
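
For readers who want to compare sequencing coverage across the three groups, here is a minimal Python sketch that recomputes the proportions from the specimen counts quoted above. The variable and function names are my own for illustration; only the counts come from the download.

```python
# Specimen counts quoted in the post: (total specimens, sequenced specimens).
counts = {
    "brown": (221, 25),
    "green": (238, 43),
    "red": (2163, 1333),
}


def sequenced_fraction(total, sequenced):
    """Return the fraction of specimens in a group that were sequenced."""
    return sequenced / total


# Per-group coverage, keyed by algal group.
coverage = {
    group: sequenced_fraction(total, sequenced)
    for group, (total, sequenced) in counts.items()
}

# Totals across all three groups.
total_specimens = sum(total for total, _ in counts.values())
total_sequenced = sum(sequenced for _, sequenced in counts.values())

for group, frac in coverage.items():
    total, sequenced = counts[group]
    print(f"{group:>5}: {sequenced} of {total} specimens sequenced ({frac:.1%})")

print(f"  all: {total_sequenced} of {total_specimens} specimens sequenced "
      f"({total_sequenced / total_specimens:.1%})")
```

The gap between the reds and the other two groups is easier to see as proportions than as raw counts, since the reds also have by far the most specimens.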

So, HADB is a great addition to the data we have from GenBank and other sources and will no doubt help us understand the geographical distribution of algal phylogenetic diversity. Thanks to the Hawaii group for generating these data and making them available.