Divergence dating with sampled ancestors (FBD model)

BEAST2 version

To use divergence dating download the latest BEAST release (v2.2.0 or higher) here

Sampled ancestor package

Then install the sampled-ancestor package using BEAUti 2 package manager. Open BEAUti 2 and in the file menu choose “Manage packages”. Then install SA package. See Managing Packages for more details. If you want to use the skyline variant of the fossilised birth-death model then you also need to install BDSKY package.

Setting up the analysis in BEAUti

The set up of the divergence dating analysis using sampled ancestor models is similar to setting up a regular divergence dating analysis.

It differs in two ways:

First you need a nexus file which includes all fossil taxa with empty sequences. It is also helpful to include ages for fossil and extant taxa in taxon names otherwise you will need to specify the age for each taxon manually. An example nexus file bears.nex goes with the package. Upload the nexus file to BEAUti. Manage linking trees and clock models if there are partitions in your analysis. There are no partitions in the example. Then switch to the Tip dates panel and tick “Use tip dates” box. Now specify the ages for fossil and extant taxa. The ages are usually measured as a distance from the present so you might need to choose “Before the present” instead “Since some time in the past”. If the age is a part of a taxon name then push “Guess” button and pick the right pattern. For the example nexus file, we choose “use everything after last _”. See also general notes for more details on using “Guess” option.

Further proceed with the site and clock model specification as usual. We do not change the default site model but choose the uncorrelated log-normal relaxed clock model.

The second main difference is the tree prior distribution. Choose the fossilised birth-death model (or the skyline variant) from the list of possible tree priors. This model has parameters: origin, diversification rate, turnover rate, sampling proportion and rho parameter. In the default settings, rho parameter is fixed to one. It is important to fix at least one parameter or place an informative prior on it because there is no comparative data for fossil taxa. For more information on this model see the paper.

Further you may also place monophyletic constraints on groups of taxa. To do this hit the plus button on the Priors panel, give a name to the taxon set and pick the taxa that have to be in the taxon set. For our example we choose all taxa from Canidae species: Canis lupus, Hesperocyon gregarius, Caedocyon tedfordi, Osbornodon sesnoni, Cormocyon copei, Borophagus diversidens. After you push the ok button the prior for this taxon set appears on the panel. Tick the monophyletic box next to it and leave none as a prior. See Constraints Of Monophyly. You might choose a prior distribution for the most recent common ancestor of this taxon set. In this case, it is particularly important to sample from the prior before running the analysis to see the resulting tree prior distribution.

Adjust the chain length and save the file.

The resulting file bears.xml goes with the package.

It is also possible to specify age ranges for fossil taxa. Currently, one can do this manually adding operators to the resulting xml file. An example file bears_ranges.xml goes with the package. In this example file, a range for one taxon is specified (see SampledNodeRandomWalker operator specification with comments at the end of the file).

Analysing the posterior distribution of trees

Troubleshooting

A common exception you might encounter is:

java.lang.Exception: Could not find a proper state to initialise. Perhaps try another seed.

in combination with one of:

P(FBD.t:bears) = -Infinity (was NaN)

P(FBD.t:bears) = Infinity (was NaN)

P(FBD.t:bears) = NaN (was NaN)

This might happen for several reasons. To resolve it, first, check whether all the initial values for parameters are valid, that is, they are within the ranges of corresponding prior distributions and whether the time of origin is older than the ages of all sampled nodes and older than the root if a starting tree is specified by a newick tree.

If all the initial values are valid then try to change initial values of parameters because the tree likelihood function (FBD tree prior) may become infinity (in this case, P(FBD.t:bears) = Infinity (was NaN) or = NaN (was NaN) is reported) or negative infinity due to numerical approximations for some combinations of parameters. Also if a random tree is used it can simulate a tree which is older than the time of origin. In this case try to adjust the population size parameter or make the initial value of the time of origin larger if the prior distribution allows this.

Leave a Reply

Bayesian evolutionary analysis by sampling trees