Tag Archives: XML

Sampling tip dates — this site is deprecated: goto https://www.beast2.org/ for up-to-date information

9 June 2015 by Remco Bouckaert

To sample the height of leaf nodes, you need to do the following:

  1. Set up a calibration on the tip you want to sample.
  2. Add an operator for scaling the tip.
  3. Add an entry to the logger if you want to log the leaf height

Tip calibration in BEAUti

To set up a calibration, the easiest way to do this is by adding a calibration in BEAUti: in the priors panel, hit the little plus (‘+’) button at the bottom of the screen, then specify the leaf you want to sample and give it a unique name. After hitting the OK button, open the details of the prior by pressing the little triangle next to the taxonset (here Homo_sapiens.prior) and a screen shows up like this:

Make sure the Tipsonly box is checked. If you have multiple tips with the same calibration you can put all of these in the same taxonset. With the tipsonly-flag set, the calibration will be applied to the leafs instead of the most recent common ancestor of the set of tips.

Tip calibration in XML

You can also use add an MRCAPrior to the XML inside the distribution element with id=”prior” like so:


  
    
  
  

Make sure taxon id’s are unique: it is possible a taxon with the id of the tip you want to sample is already specified elsewhere in the XML. If so, when starting BEAST, you will get an error saying something like

Error 104 parsing the xml input file

IDs should be unique. Duplicate id 'Homo_sapiens' found

identifying the id that was already specified.

Also, you want to point to the right tree specified by tree="@Tree.t:tree" in the fragement above.

Tip sample operator in XML

Once the calibration is set up, for each tip you want to sample add an operator to the XML like so:


and edit it as follows:

  • Make sure the id is unique, just changhing the number will do that.
  • The taxonset attribute should refer to the correct taonset.
  • Check that the tree attribute points to the tree you want to sample from. It should be the same tree as in the MRCAPrior.

Add logger entry

To log the leaf height in the trace log, so you can see its mean height, as well as check how well it mixes, add an entry referring to the MRCAPrior to the tracelog. Just place a log entry inside the logger with id=”tracelog” like so:


That’s all.

All about starting trees — this site is deprecated: goto https://www.beast2.org/ for up-to-date information

28 July 2014 by Remco Bouckaert

Before starting an MCMC run, BEAST uses State Node initialisers to set up the starting trees (and other parameters). Often a good starting tree is already available from another analysis (e.g. a maximum likelihood tree). Though it is against the spirit of MCMC, which guarantees any random starting state will lead to convergence, a good starting tree can help with large analysis to speed up convergence quite a bit. Of course another reason to specify a starting tree is if you want to do an analysis using just a single known tree topology.

We start with standard analysis, but see below for *BEAST analyses.

Standard analysis

For a standard analysis, there are three ways to specify a starting tree

  • a random tree
  • a tree in Newick format
  • a cluster tree, e.g. through UPGMA or neighbour joining

By default, BEAUti uses a RandomTree state-node initialiser to generate a starting tree. If you want to replace it with a Newick tree or cluster tree, first you need to remove (or comment out) the RandomTree element from the XML. For a tree in partition XYZ26, just search for RandomTree and remove the XML fragment with id RandomTree.t:XYZ26. It should look something like this:


    
        1.0
    

Before removing, make note of the initial attribute (@Tree.t:XYZ26> here) and taxa attribute (@XYZ26 here).

Newick starting tree

After removing the Random tree, in the same place, add a TreeParser using the following fragment


The attributes that you want to specify are

  • id="NewickTree.t:XYZ26" the ID should be unique, so something like NewickTree.t: plus the name of the tree would be suitable.
  • initial="@Tree.t:XYZ26" this refers to the tree being initialised. It should be the same as the initial attribute of the RandomTree that was removed.
  • taxa="@XYZ26" this refers to the alignment and ensures that taxa in the tree are lined with those in the alignment. For BEAST v2.1.x, this must be specified to prevent starting with a mislabelled tree
  • newick="((your,(tree,goes)),here)" obviously needs to be replaced with your own tree in Newick format.

This is all you need to know to set up a starting tree. There are a few optional attributes you can use to make life a bit easier:

  • adjustTipHeights is true by default, which means tips of the tree will be set to zero, or if tip dates are specified to these particular tip dates. By setting to adjustTipHeights="false", tips will be initialised by the heights in the Newick tree.
  • if adjustTipHeights="false" then threshold specifies threshold under which node heights (derived from lengths) are set to zero. This helps when there are numeric issues with adding the lengths.
  • scale scale used to multiply internal node heights during parsing. Useful for importing starting from external programs, for instance, RaxML tree rooted using Path-o-gen.
  • IsLabelledNewick="true" Is the Newick tree labelled (alternatively contains node numbers)?
  • If sLabelledNewick="false" then “offset="1" is the lowest taxa number. The default=1 but 0 is common as well. Taxa numbers are as they are ordered in the alignment referred to with the taxa attribute.

Cluster tree

After removing the Random tree, in the same place, add a ClusterTree with the following fragment


The attributes that you need to specify are id='UPGMATree.t:XYZ26', initial="@Tree.t:XYZ26" and taxa="@XYZ26", which are as for Newick tree (see above).

You can use different clustering algorithms by specifying the clusterType="upgma" attribute.

clusterType Description
single single link
complete complete link
upgma or average UPGMA=average link
mean mean link
centroid centroid
ward Ward’s method
adjcomplete adjusted complete link
neighborjoining neighborjoining
neighborjoining2 neighborjoining2 – corrects tree for tip data, unlike plain neighborjoining

*BEAST analysis

For a *BEAST analysis, the default in BEAUti is to generate a UPGMA tree for the species tree, then generate UPGMA trees for each of the gene trees and fit them inside the species tree. In previous incarnations of BEAST, a random tree was generated for species as well as for gene trees, but gene tree branches ending in leaves were lengthened such that each first coalescence of a gene tree was above the root of the species tree. Though this ensures that the gene trees fit inside the species tree, it leads to long burn-in times.

If you want to use a different starting tree for the species tree, first move (not remove!) the StarBeastStartState element generated by BEAUti to just before the run-element with id="mcmc". BEAUti will have generated some elements inside the start-state (such as the species-tree-prior) that we need elsewhere in the analysis. By moving the init element outside the run element, it will not activate the StarBeastStartState.

The StarBeastStartState looks something like this:


    
    
    

For every gene tree there will be a tree entry in the StarBeastStartState. To replace the species tree with a Newick tree, add the following fragment (at the place where the StarBeastStartState was):


The only thing to set up is the Newick using the newick attribute — all other attributes should be as they are (it is using the species tree, which is identified by the Species partition in every *BEAST analysis generated by BEAUti).

For each of the gene trees, you need to specify a tree as well. You can specify them using a Newick tree as for the Standard analysis (see above). Alternatively, you can use a random tree that is placed ‘above’ the species tree. To do this, you add for the first gene tree (just below the new Species tree start tree):


  

where you replace the initial='@Tree.t:26' and taxa="@26" attributes to match your tree (instead of 26, as in this example). For every subsequent gene tree, you add (below the first gene start tree):


where you replace the initial='@Tree.t:29' and taxa="@29" attributes to match your tree (instead of 29) and make sure the id is unique.