Total-evidence analysis with the fossilised birth-death (FBD) model uses molecular sequence data from extant species, morphological data from fossil and extant species, and fossil ages to infer the phylogeny, including divergence times and macroevolutionary parameters.
To run such an analysis in BEAST2, you need to install the SA (sampled-ancestor) package, version 1.1.3, and the morph-models package, version 1.0.2.
The molecular and morphological data should be in two separate NEXUS files. The set of taxa and their labels should be identical in both files. The molecular data file should therefore contain “empty” sequences for the fossil taxa, i.e., sequences filled with gaps or question marks. Example files are provided: a molecular data NEXUS file and a morphological data NEXUS file.
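The requirement that fossil taxa appear in the molecular file with "empty" sequences can be met with a small preprocessing step. The following is a minimal sketch, not part of BEAST2 or BEAUti: the function name `pad_missing_taxa` and the taxon labels are hypothetical, and it assumes the molecular sequences are already held in a dictionary keyed by taxon label.

```python
def pad_missing_taxa(sequences, all_taxa, missing_symbol="?"):
    """Return a copy of `sequences` with an all-missing row for each taxon that lacks one.

    `sequences` maps taxon label -> aligned molecular sequence.
    `all_taxa` is the full taxon list shared with the morphological file.
    """
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all molecular sequences must have the same length")
    n_sites = lengths.pop()

    padded = dict(sequences)
    for taxon in all_taxa:
        # Fossil taxa have no molecular data, so fill their row with '?'.
        padded.setdefault(taxon, missing_symbol * n_sites)
    return padded


# Hypothetical labels; the trailing number stands for the sample age.
molecular = {"Extant_A_0": "ACGTAC", "Extant_B_0": "ACGTTC"}
taxa = ["Extant_A_0", "Extant_B_0", "Fossil_C_25.5"]
padded = pad_missing_taxa(molecular, taxa)
```

The padded dictionary can then be written out as the MATRIX block of the molecular NEXUS file, so that both files list the same taxa.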
Creating XML in BEAUti
BEAUti supports most of the features of a total-evidence analysis. To create an XML file, run BEAUti and load the morphological data: either hit the plus button in the bottom-left corner and choose Add Morphological Data, or choose Add Morphological Data from the File menu. BEAUti automatically partitions the data matrix by the number of states of each character. It determines the number of states for a character from its description in the CHARSTATELABELS block. If a character has no description, BEAUti counts the number of different symbols that occur in the corresponding column.
BEAUti will then ask whether to condition on coding only variable characters (the Lewis Mkv model). In most cases you need to condition on coding only variable characters. For this example, however, we do not choose this option, because the taxa were pulled at random from a real data set and constant characters may therefore occur in the morphological data matrix. Next, load the molecular alignment as usual (choose Import Alignment).
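The fallback rule described above (counting the distinct symbols in a column when a character has no CHARSTATELABELS entry) can be illustrated as follows. This is only a sketch of the idea, not BEAUti's actual code: the function name and the toy matrix are hypothetical, and treating `?` and `-` as non-states is an assumption.

```python
from collections import defaultdict


def partition_by_state_count(matrix, ignored="?-"):
    """Group character (column) indices by the number of distinct symbols observed.

    `matrix` maps taxon label -> string of single-character state symbols.
    Symbols in `ignored` (missing data and gaps, by assumption) are not counted.
    """
    partitions = defaultdict(list)
    for index, column in enumerate(zip(*matrix.values())):
        n_states = len(set(column) - set(ignored))
        partitions[n_states].append(index)
    return dict(partitions)


# Hypothetical toy matrix with four characters.
morph = {
    "Taxon_A": "0102",
    "Taxon_B": "1110",
    "Taxon_C": "0?01",
}
groups = partition_by_state_count(morph)
```

In this toy matrix the second character shows only one state, which is the kind of constant character that the Mkv conditioning assumes is absent.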
Now you need to link the trees for all data. To do this, select all the partitions and hit the Link Trees button. Then rename the tree fields of all partitions to a single name, say, tree.
Now go to the Tip Dates panel and select Use tip dates. For the example NEXUS files, the dates are part of the taxon names, so we choose dates specified as years before present (the dates are actually in millions of years) and guess the dates from the text after the last _.
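The "after last _" rule amounts to taking the text that follows the final underscore in each taxon label and reading it as a number. A minimal sketch of that rule, using a hypothetical helper and hypothetical labels (this is not BEAUti's implementation):

```python
def tip_date_from_label(label):
    """Return the age encoded after the last underscore of a taxon label."""
    _, _, suffix = label.rpartition("_")
    # float() raises ValueError if the suffix is not a number.
    return float(suffix)


# Hypothetical labels; ages are in millions of years before present.
fossil_age = tip_date_from_label("Fossil_species_25.5")
extant_age = tip_date_from_label("Extant_species_0")
```

A quick check like this over all labels before loading the files can catch taxa whose names do not end in a valid age.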
Next, go to the Site Model panel. For the DNA partition, we choose four categories for the Gamma variation model and estimate the Shape parameter. For the morphological partitions, we keep the default model, which is Lewis Mk with no Gamma variation. Note that a model with Gamma variation across morphological characters, with a Gamma shape parameter shared by all partitions, would fit these data better; however, this setting is not currently available in BEAUti. See below for how to add this option to the resulting XML file. On the Clock Model panel, we choose Relaxed Clock Log Normal models for both partitions.
Then go to the Priors panel and choose Fossilised Birth Death Model as the tree prior distribution. Now we need to adjust the settings of the tree prior. Hit the arrow button to see the available parameters with their starting values. We leave the default starting values for all parameters except Diversification Rate, which should lie within the interval [0, 0.1], so we set its starting value to 0.05. Also, in this example only a fraction of the extant species is included in the analysis, so the Rho parameter should be estimated (select the estimate box next to this parameter). On the same panel, we also need to change the priors for the time of origin and the diversification rate: set Uniform(0,150) for Origin and Uniform(0,0.1) for Diversification Rate. We choose LogNormal(-5.5, 2) for the mean of the log-normally distributed morphological clock rates and LogNormal(-3.5, 1) for the molecular clock rates.
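To get a feel for the scale implied by the two LogNormal priors, one can compute their medians and central 95% intervals. The sketch below assumes that the two numbers in LogNormal(M, S) are the mean and standard deviation in log space (that is, that the "mean in real space" option is not selected); if that assumption does not hold for your setup, the values will differ. The helper name is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_summary(m, s, coverage=0.95):
    """Median and central interval of a log-normal with log-space mean m and sd s."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return {
        "median": math.exp(m),
        "lower": math.exp(m - z * s),
        "upper": math.exp(m + z * s),
    }


morph_prior = lognormal_summary(-5.5, 2)  # morphological clock rate prior
mol_prior = lognormal_summary(-3.5, 1)    # molecular clock rate prior
```

Printing the two dictionaries shows that, under this assumption, the morphological prior is centred on a lower rate and is considerably wider than the molecular one.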
Next, go to the MCMC panel and rename the log files: for example, set the tracelog name to penguins.log and the treelog name to penguins.tree. Set Log Every to 10000 for the tree log to keep the file size reasonable. You are now ready to save the XML specification. You can compare the resulting XML file with an example XML file (with some edits described below) obtained by following this tutorial.
Editing XML to add extra features
Once you have prepared the XML file, you can run it with BEAST. There are a few things that you might want to change manually in this XML. First of all, to speed up mixing, merge the two relaxedUpDownOperators into one, so that both mean clock rates are scaled up and the tree is scaled down simultaneously (see comments in the example XML file).
You may also add age ranges for fossil samples as described here.
To add Gamma variation across morphological characters with a shared Gamma shape, set Gamma categories to four and estimate the Shape parameter for one of the Lewis Mk models when you specify the analysis in BEAUti. If you add Gamma variation to partition two, then you need to add the attributes:
gammaCategoryCount="4" shape="@gammaShape.s:penguins_morph2"
to the site models of the remaining morphological partitions and remove the existing shape parameters from these models (see comments in the example XML file).
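If there are many morphological partitions, the same edit can be scripted instead of done by hand. The sketch below applies it to a single site model element with Python's standard `xml.etree.ElementTree`. The XML fragment is a simplified, hypothetical stand-in for what BEAUti writes (the element id `morphSiteModel.s:penguins_morph3` and the layout of the `parameter` child are assumptions), so check the element and attribute names against your own file before relying on it.

```python
import xml.etree.ElementTree as ET


def share_gamma_shape(site_model, shared_shape_id, categories=4):
    """Point a site model at a shared Gamma shape parameter.

    Removes the model's own <parameter name="shape"> child (if any) and
    sets the gammaCategoryCount and shape attributes instead.
    """
    for child in list(site_model):
        if child.tag == "parameter" and child.get("name") == "shape":
            site_model.remove(child)
    site_model.set("gammaCategoryCount", str(categories))
    site_model.set("shape", "@" + shared_shape_id)


# Hypothetical, simplified site model for a third morphological partition.
fragment = """
<siteModel id="morphSiteModel.s:penguins_morph3" spec="SiteModel">
    <parameter id="gammaShape.s:penguins_morph3" estimate="false" name="shape">1.0</parameter>
</siteModel>
""".strip()

site_model = ET.fromstring(fragment)
share_gamma_shape(site_model, "gammaShape.s:penguins_morph2")
edited = ET.tostring(site_model, encoding="unicode")
```

Note that ElementTree does not preserve comments when it parses and rewrites a whole file by default, so for a one-off change editing the XML by hand may be simpler.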