I’ve just uploaded new versions of Maxent Model Surveyor and MatrixGradients to my website.

Two minor changes have been introduced in version 1.07 of Maxent Model Surveyor. First, users can now specify the amount of memory that Maxent can use with the -jm flag. Second, I’ve updated the parser of the Maxent options file, in which Maxent flags can be specified. The parser now prints out a warning if the user tries to change options for which MMS doesn’t allow user control. In addition, it will terminate the program if the user attempts to manually toggle a predictor or a species. More information is provided on the website about what to do if you want to exclude predictors a priori and how to specify flags in the Maxent options file (I’ve included an example file). Thanks to Mark Andersen for bringing these issues to my attention.

MatrixGradients is a Perl script that draws colored matrices in which the colors correspond to the values in the matrix. It is now in version 1.02, which includes the option to transform the values in the matrix before plotting. This is useful if many values in the matrix are close to the maximum or minimum value and you want to exaggerate the color differences in that part of the value range. This is illustrated below for a matrix in which many values are close to zero, but with a few values that are considerably higher (up to 22.4). If plotted without transformation (left panel), the entire matrix is green, with just one red value, i.e. not very informative. If a log10 transformation is applied to the same matrix, a much clearer picture of what happens in the near-zero values emerges (right panel), simply because the transformation spreads the near-zero values over a wider part of the color gradient. The new version can also print the values in the matrix, as shown in the figure.

MatrixGradients transformation
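To make the effect of the transformation concrete, here is a minimal Python sketch of the idea. It is not the MatrixGradients code itself: the function names, the example matrix and the exact green-yellow-red interpolation are assumptions made for illustration, and the log10 transformation only works for strictly positive values.

```python
import math

def gradient_color(fraction):
    """Map a fraction in [0, 1] to an (r, g, b) color, green -> yellow -> red."""
    fraction = min(max(fraction, 0.0), 1.0)
    if fraction <= 0.5:
        # First half of the gradient: green (0,255,0) to yellow (255,255,0)
        return (round(255 * fraction * 2), 255, 0)
    # Second half: yellow (255,255,0) to red (255,0,0)
    return (255, round(255 * (1 - fraction) * 2), 0)

def matrix_to_colors(matrix, transform=None):
    """Optionally transform the values, rescale them to [0, 1], then color them."""
    values = [[transform(v) if transform else v for v in row] for row in matrix]
    flat = [v for row in values for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    return [[gradient_color((v - lo) / span) for v in row] for row in values]

# Hypothetical matrix: mostly near-zero values plus one large value (22.4)
matrix = [[0.01, 0.02, 0.05],
          [0.10, 0.50, 22.4]]

raw = matrix_to_colors(matrix)                          # nearly all green
logged = matrix_to_colors(matrix, transform=math.log10) # visible gradient

for row_raw, row_log in zip(raw, logged):
    print(row_raw, "|", row_log)
```

Without the transformation, every value except 22.4 lands at the very green end of the gradient. After log10, the same values are spread over a much larger part of it.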

I’ve just uploaded new versions of OccurrenceThinner and RasterTools to my website.

OccurrenceThinner is a tool that performs distance-based thinning of species occurrence data to reduce geographic sampling bias in niche modeling. It takes a set of species occurrence records and a kernel density grid file as input. It then filters out occurrence records using a probability-based procedure. More information is available on the software website. The new version 1.04 fixes a problem reading the header of certain ASCII files. Thanks to Diego Nieto-Lugilde for pointing out the problem. The option to round coordinates to a user-specified number of decimals is no longer included for reasons described below.
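For readers who want a feel for what probability-based thinning against a density grid can look like, here is a rough Python sketch. It is only an illustration and not the procedure OccurrenceThinner actually uses: the rule (keep a record when a uniform random draw is at least the density at that record, with densities scaled to 0..1), the function names and the toy density function are all assumptions.

```python
import random

def thin_occurrences(records, density_at, rng):
    """Keep each (lon, lat) record with probability 1 - density_at(lon, lat).

    density_at is assumed to return a kernel density scaled to [0, 1];
    records in densely sampled areas are therefore more likely to be dropped.
    """
    kept = []
    for lon, lat in records:
        if rng.random() >= density_at(lon, lat):
            kept.append((lon, lat))
    return kept

# Hypothetical density surface: heavily sampled west of longitude 0, unsampled east of it
def toy_density(lon, lat):
    return 1.0 if lon < 0 else 0.0

records = [(-10.0, 5.0), (-9.5, 5.1), (20.0, -3.0), (21.0, -2.5)]
thinned = thin_occurrences(records, toy_density, random.Random(42))
print(thinned)
```

With this toy density the outcome is deterministic: records at density 1 are always removed and records at density 0 are always kept. Intermediate densities would give a different subset on each run unless the random generator is seeded.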

Two minor updates were made to RasterTools.

The script moveCoordinatesToClosestDataPixel.jar was updated to version 1.03. The main change is that you can now specify a distance threshold for moving coordinates. Many thanks to Niels Raes, who suggested this on the Maxent forum. In addition, this version fixes the same ASCII header issue mentioned above and no longer permits rounding the coordinates to a specified number of decimals. In some cases, the rounding caused coordinates to move into no-data pixels, which is exactly the opposite of what this script is supposed to do. Thanks to Vanessa Marcelino for pointing out this problem.
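The idea of moving a point to the closest data pixel, subject to a distance threshold, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the code in the jar file: the grid is a small list of lists, no-data cells are marked with None, positions and distances are in pixel units rather than geographic coordinates, and all names are made up.

```python
import math

NODATA = None  # assumed no-data marker for this sketch

def closest_data_pixel(grid, row, col, max_distance=None):
    """Return (row, col) of the nearest data cell, or None if none is close enough.

    A point that already sits on a data cell is returned unchanged.
    Distances are Euclidean distances between cell indices.
    """
    if grid[row][col] is not NODATA:
        return (row, col)
    best, best_dist = None, math.inf
    for r, line in enumerate(grid):
        for c, value in enumerate(line):
            if value is NODATA:
                continue
            dist = math.hypot(r - row, c - col)
            if dist < best_dist:
                best, best_dist = (r, c), dist
    if best is None:
        return None
    if max_distance is not None and best_dist > max_distance:
        return None  # nearest data cell is beyond the threshold: do not move
    return best

grid = [[NODATA, NODATA, 1.0],
        [NODATA, NODATA, NODATA],
        [2.0,    NODATA, NODATA]]

print(closest_data_pixel(grid, 0, 0))                    # moved to a data cell
print(closest_data_pixel(grid, 0, 0, max_distance=1.5))  # too far: left alone
```

The threshold guards against silently relocating a record that lies far from any valid pixel, which usually points to a georeferencing problem rather than a coastline mismatch.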

The script extractDataForCoordinates.jar was updated to version 1.03, fixing the same issue with ASCII headers.

In the context of Vanessa Marcelino’s MSc thesis on the evolutionary dynamics of Halimeda seaweeds, we were trying to visualize niche conservatism, or the lack thereof, across a whole bunch of species simultaneously.

As it turns out, making a heat map of niche model similarities gave a very satisfactory result. We calculated two niche overlap measures (Schoener’s D and Warren’s I) for the species’ Maxent models using ENM Tools. The resulting matrices were merged into one with Schoener’s D in the top right triangle and Warren’s I in the lower left triangle. Species were sorted in the matrix in the order that they appeared in the phylogeny, so related species are closer together in the matrix. The similarity values were then converted to colors along a green-yellow-red color gradient with MatrixGradients. The result looks like this (follow this link for PDF version):
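For anyone who wants to reproduce the layout of the merged matrix, here is a small Python sketch. We used ENM Tools for the real calculations. The code below only illustrates the commonly used definitions of the two measures (D = 1 - 0.5 * sum|p - q| and I = 1 - 0.5 * sum(sqrt(p) - sqrt(q))^2, with suitability scores normalized to sum to 1) and the triangle layout. The toy suitability vectors and function names are assumptions.

```python
import math

def normalize(scores):
    """Rescale suitability scores so that they sum to 1 over all grid cells."""
    total = sum(scores)
    return [s / total for s in scores]

def schoener_d(p, q):
    return 1 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def warren_i(p, q):
    return 1 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def merged_matrix(models):
    """Schoener's D above the diagonal, Warren's I below it, 1.0 on the diagonal.

    models must already be sorted in phylogenetic order.
    """
    n = len(models)
    probs = [normalize(m) for m in models]
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = schoener_d(probs[i], probs[j])  # top right triangle
            matrix[j][i] = warren_i(probs[i], probs[j])    # lower left triangle
    return matrix

# Hypothetical suitability scores for three species over four grid cells
models = [[1, 1, 0, 0],
          [0, 0, 1, 1],
          [1, 1, 1, 1]]

m = merged_matrix(models)
for row in m:
    print(["%.3f" % v for v in row])
```

The resulting matrix can then be passed to a coloring step such as MatrixGradients.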

heat map

Looks very cool. Well, mostly red and hot actually… The first striking pattern is that D values are on average lower than I values. The fact that almost the entire figure is red indicates that niches are highly conserved in Halimeda. There is also a green-yellowish cross going across the figure (as a horizontal and a vertical band of dissimilar niches). This band represents the species that have invaded colder water and these are related to one another, so they lie together in the matrix. The first column/row is also very dissimilar to all the rest. This is the sole Mediterranean species, which occurs in much colder water than all other species in the genus.

Here are the legend and the way the table was built, just to complete the picture:


I’ve just posted a new version of Maxent Model Surveyor on my website.

Maxent Model Surveyor is a program that evaluates different sets of predictors and different model complexities for Maxent niche modeling. It automatically calculates the test AUC and the Akaike and Bayesian information criteria (AIC, BIC; Warren & Seifert 2011) under the various predictor sets and model complexities and suggests “suitable” sets of predictors and model complexities for your dataset.
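As a reminder of what the information criteria measure, here is a minimal Python sketch using the standard formulas AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of model parameters, n the number of occurrence records and ln L the log-likelihood. It does not show how Maxent Model Surveyor derives these quantities from Maxent output. The candidate models and their numbers below are entirely hypothetical.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion for n observations."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical candidates: name -> (log-likelihood, number of parameters)
candidates = {
    "all predictors": (-95.0, 12),
    "temperature only": (-100.0, 4),
}
n_records = 50  # hypothetical number of occurrence records

for name, (loglik, k) in candidates.items():
    print(name, "AIC=%.2f" % aic(loglik, k), "BIC=%.2f" % bic(loglik, k, n_records))

# Lower values are better for both criteria
best_by_aic = min(candidates, key=lambda name: aic(*candidates[name]))
print("lowest AIC:", best_by_aic)
```

In this made-up example the simpler model wins: the extra parameters of the full model improve the likelihood, but not enough to offset the penalty.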

Version 1.04 includes the option to specify a custom test dataset when exploring models based on test AUC. I’ve included this option because we wanted to identify a suitable set of predictors that would not bias analyses towards one or the other ocean basin (Atlantic vs. Indo-Pacific). We have many species that occur in only one of the two ocean basins, and when comparing models between strictly Atlantic and strictly Indo-Pacific species, environmental differences between ocean basins could in theory bias the comparison. To avoid this, one could look for predictor sets that have good predictive power across ocean basins for species that do occur in both, and avoid those predictor sets that don’t.
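To illustrate what a test AUC on a custom test dataset amounts to, here is a small Python sketch. It uses the rank-based definition of AUC (the probability that a randomly chosen test presence gets a higher suitability score than a randomly chosen background point, with ties counting as one half). It is an assumption-laden illustration rather than the program's own code, and the scores are made up.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: fraction of (presence, background) pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical scenario: a model trained on Atlantic records is scored at
# Indo-Pacific test presences and at Indo-Pacific background points.
test_presence_scores = [0.9, 0.3]
test_background_scores = [0.5, 0.1]

print(auc(test_presence_scores, test_background_scores))
```

A predictor set that keeps a high AUC when the test records come from the other ocean basin is transferring well. One whose AUC collapses towards 0.5 is probably picking up basin-specific environmental differences.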

In closing I want to mention that I’ve renamed this program to Maxent Model Surveyor (instead of the previous Maxent Model Selector) because “surveying” is a more appropriate description of what it does and I don’t want to encourage people to simply let the program “select” a predictor set and model complexity. Programs like this are no substitute for a good understanding of your organism’s physiology and serve as a guiding tool only.

Most of the projects that my coworkers and I work on involve analyses of big datasets with information about algal specimens. One of Tom Schils’ projects that I’m helping out with aims to sketch an image of the geographical patterns of seaweed diversity using a combination of tools. Tom’s been accumulating floristic information that we are now trying to complement with DNA sequence data to characterize how species diversity and phylogenetic diversity are distributed on Earth.

The Hawaiian Algal Database is a superb resource of information about — you guessed it — Hawaiian algae. The data were generated, compiled and put online by Alison Sherwood and Gernot Presting of the University of Hawaii at Manoa, and a report about the dataset was published in BMC Plant Biology. It’s a specimen-centered database that has all sorts of metadata including geographical coordinates, information about the collection site, and in many cases DNA sequences of up to 3 markers from different genomes (yes, algae have 3, at least).

Because the data are available only through the online HADB interface, Tom encouraged me to write a script to download the information we needed to integrate the Hawaiian data with ours. I wrote a Perl script that uses the LWP library to download and store the information in a more analysis-friendly format. In case anyone is interested, I’m linking the script here.

I downloaded information for the 221 specimens of brown algae, 238 of green algae and 2163 of red algae in the dataset. What’s absolutely great is that for the reds, 61% of the samples have been sequenced; that’s 1333 sequenced specimens belonging to 213 unambiguously named species! Unfortunately, far fewer specimens of greens (43) and browns (25) were sequenced.

Stats for Hawaiian Algal Database specimens
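The sequenced fractions for the three groups can be recomputed from the counts quoted above with a few lines of Python. The dictionary layout and function name are just for illustration.

```python
# Counts quoted in the post: group -> (total specimens, sequenced specimens)
counts = {
    "brown": (221, 25),
    "green": (238, 43),
    "red": (2163, 1333),
}

def percent_sequenced(total, sequenced):
    """Percentage of specimens in a group that have sequence data."""
    return 100.0 * sequenced / total

percentages = {group: percent_sequenced(total, sequenced)
               for group, (total, sequenced) in counts.items()}

for group, pct in percentages.items():
    print(f"{group}: {pct:.1f}% sequenced")
```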

So, HADB is a great addition to the data we have from GenBank and other sources and will no doubt help us understand the geographical distribution of algal phylogenetic diversity. Thanks to the Hawaii group for generating these data and making them available.