Program for phylogenetic study of joint evolution of species and genes
The program embed3GL is intended for solving four phylogenetic problems on the basis of an original algorithm [1–4] of polynomial (cubic) complexity.
Common source data for the first three problems include:
- a rooted species tree, initially binary, which is then given additional nodes that divide it into contemporary slices so that all leaves (extant species) lie in the same (deepest) slice. The number of additional nodes on an edge is specified as the “length” of that edge: if the length equals 1 or is omitted, no nodes are added; a length of 2 adds one node on that edge, and so on. The tree must contain an outgroup, a “species” named Out. A separate program (see the link below) can be used to time-slice the species tree and insert the outgroup.
- a set of rooted gene trees, which must not contain polytomous nodes in the current version.
The detailed structure of the input and output data is described in the manual.
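The edge-length convention for the species tree can be illustrated with a minimal sketch. The `Node` class and the `expand` and `leaf_slices` helpers are hypothetical names introduced here for illustration; they do not reproduce the input format of embed3GL or the code of the time-slicing program. The sketch only assumes the rule stated above: an edge of length k receives k − 1 additional nodes, after which all leaves should fall into the same slice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Species tree node; `length` belongs to the edge above the node."""
    name: Optional[str] = None
    length: int = 1                      # an omitted length is treated as 1
    children: List["Node"] = field(default_factory=list)


def expand(node: Node) -> Node:
    """Copy the tree, placing k - 1 unary nodes on every edge of length k."""
    copy = Node(node.name, 1, [expand(c) for c in node.children])
    for _ in range(node.length - 1):
        copy = Node(None, 1, [copy])     # extra node on the edge above `copy`
    return copy


def leaf_slices(node: Node, depth: int = 0) -> set:
    """Return the set of slice indices (depth from the root) of all leaves."""
    if not node.children:
        return {depth}
    slices = set()
    for child in node.children:
        slices |= leaf_slices(child, depth + 1)
    return slices


def count_nodes(node: Node) -> int:
    return 1 + sum(count_nodes(c) for c in node.children)


# Illustrative tree (((A, B), C:2), Out:3); the lengths are made up.
root = Node(children=[
    Node(children=[
        Node(children=[Node("A"), Node("B")]),
        Node("C", length=2),
    ]),
    Node("Out", length=3),
])

sliced = expand(root)
print(leaf_slices(root))    # leaves at different depths before expansion
print(leaf_slices(sliced))  # a single slice index after expansion
```

In this example the edge to C gains one node and the edge to Out gains two, so every leaf ends up in slice 3.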
Problem 1 involves computing, for each gene tree, the cost of embedding that tree into the species tree. The cost is reported for each gene tree of the input set as well as the total over the set. A side effect of solving Problem 1 is the binarization (binary resolution) of a gene tree that contains polytomous nodes (not implemented in the current version).
Problem 2 is solved on the basis of working data obtained from Problem 1. As a result, for each gene tree, the evolutionary scenario of its embedding into the species tree is built. The scenario is shaped as a tree of evolutionary events that contains both unary and binary edges.
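As a rough illustration of what a scenario shaped as a tree of events might look like in memory, the sketch below uses a hypothetical `Event` class whose nodes have one child (unary) or two children (binary), plus childless leaves. The class, the `tally` helper, the `"leaf"` placeholder label, and the choice of which event is unary or binary are all assumptions made for the example; the actual scenario format is defined in the manual.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """One node of a scenario tree (hypothetical layout)."""
    kind: str                                   # e.g. "gain", "duplication"
    children: List["Event"] = field(default_factory=list)


def tally(root: Event):
    """Count nodes by event kind and by number of children (0, 1 or 2)."""
    kinds, arity = Counter(), Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        kinds[node.kind] += 1
        arity[len(node.children)] += 1
        stack.extend(node.children)
    return kinds, arity


# Made-up scenario: a gain, then a duplication, one copy later transferred.
scenario = Event("gain", [
    Event("duplication", [
        Event("leaf"),
        Event("transfer", [Event("leaf")]),
    ]),
])

kinds, arity = tally(scenario)
print(dict(kinds))
print(dict(arity))
```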
Problem 3 can be solved after binary resolution of input gene trees. Additional user-specified data are required for this problem, namely:
- I-type: a fixed set of types of evolutionary events (e.g. loss, gain, duplication, transfer, etc.); and
- T-type: a set of gene tree nodes all of whose descendant leaves are specially labeled in one or multiple gene trees (e.g. “a set of ancestors of ribosomal genes”).
The output of Problem 3 consists of two functions in tabular form: f(I,x), the average number of I-type events in the tube (=edge) x of the species tree; and g(I,T), the average number of I-type events occurring on the edges of type T.
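The two tables can be pictured with a small sketch. It assumes, purely for illustration, that each gene tree scenario is available as a list of `(event_type, tube, edge_types)` records and that "average" means the mean over the gene trees in the set; neither the record layout nor this averaging rule is taken from embed3GL, and the function name `average_tables` is hypothetical.

```python
from collections import Counter


def average_tables(scenarios):
    """Build f(I, x) and g(I, T) as dictionaries of per-gene-tree averages.

    `scenarios` holds one list of (event_type, tube, edge_types) records
    per gene tree; `edge_types` is the set of T labels of the gene tree
    edge on which the event occurs (possibly empty).
    """
    n = len(scenarios)
    f_counts, g_counts = Counter(), Counter()
    for events in scenarios:
        for event_type, tube, edge_types in events:
            f_counts[(event_type, tube)] += 1
            for t in edge_types:
                g_counts[(event_type, t)] += 1
    f = {key: count / n for key, count in f_counts.items()}
    g = {key: count / n for key, count in g_counts.items()}
    return f, g


# Made-up scenarios for two gene trees.
scenarios = [
    [("duplication", "e1", {"ribosomal"}), ("loss", "e2", set())],
    [("duplication", "e1", set()),
     ("duplication", "e1", {"ribosomal"}),
     ("transfer", "e3", {"ribosomal"})],
]

f, g = average_tables(scenarios)
for (event_type, tube), value in sorted(f.items()):
    print(f"f({event_type}, {tube}) = {value}")
for (event_type, t), value in sorted(g.items()):
    print(f"g({event_type}, {t}) = {value}")
```

Counting first and dividing once keeps the averages exact for small inputs and makes it easy to swap in a different normalization if the manual defines the average differently.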
Starting from the current version, embed3GL also provides for solving Problem 4, which is the building of a supertree that amalgamates given binary trees. Instead of gene trees, here embed3GL uses a set of basis trees built with the program Basis3GL. This method of supertree building is more precise than the algorithm implemented in Super3GL, but it is significantly slower. This is why we recommend running embed3GL on a high performance cluster if Problem 4 is to be solved.
The program embed3GL is written in C/C++ and has a command line interface. The program supports parallelization if an MPI 1.2 (or above) environment is available. The program is portable and can be compiled for Windows 32/64-bit, Linux, Unix, and macOS. Windows executables (32/64-bit, non-MPI/MPI versions) and the source code for Linux can be freely downloaded from the links below. The source code is available free of charge under the GNU General Public License (GPL) version 3.
Downloadable files
| | Version without MPI | Version for MPICH2 1.4.1p |
|---|---|---|
| embed3GL executables for Windows 32bit | 1.1.7 | 1.1.7 |
| embed3GL executables for Windows 64bit | 1.1.7 | 1.1.7 |
| embed3GL user's manual (PDF) | embed3gl_en | |
| embed3GL source code for Linux - GNU GPL V3 | 1.1.7 | |
| Windows executable for time-slicing of a species tree | time_slices | |
References
1. V.A. Lyubetsky, L.I. Rubanov, L.Yu. Rusin, K.Yu. Gorbunov. Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios. Biology Direct, 2012, Vol. 7, Art. 48. DOI: 10.1186/1745-6150-7-48
2. K.Yu. Gorbunov, V.A. Lyubetsky. Reconstructing the evolution of genes along the species tree. Molecular Biology, 2009, Vol. 43, No. 5, P. 881–893. DOI: 10.1134/S0026893309050197
3. K.Yu. Gorbunov, V.A. Lyubetsky. An algorithm of reconciliation of gene and species trees and inferring gene duplications, losses and horizontal transfers. Information Processes, 2010, Vol. 10, No. 2, P. 140–144 (in Russian). EDN: OXAFBL
4. K.Yu. Gorbunov, V.A. Lyubetsky. The tree nearest on average to a given set of trees. Problems of Information Transmission, 2011, Vol. 47, No. 3, P. 274–288. DOI: 10.1134/S0032946011030069