
Laboratory of Mathematic methods and models in bioinformatics,
Institute for Information Transmission Problems,
Russian Academy of Sciences

Example 4 of using the Super3GL program

Download all Example 4 files: example814.zip

The set of input trees in Example 4 consists of 1511 gene trees (file all_trees.tre) for 820 bacterial species. Each leaf label in a gene tree begins with the abbreviated species name (the segment before the first underscore); the subsequent segments give the genome number and the gene name. The full species names are listed in the file BacNames.csv. Six of the 820 species in the table do not occur in any input tree, and two trees each represent only one species. The program discards these spurious data, leaving 1509 trees with 814 species.
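The label convention and the filtering step described above can be illustrated with a short Python sketch. It is not part of Super3GL and makes several assumptions: the trees are plain Newick strings without quoted labels or comments, the names species_in_tree and filter_trees are hypothetical helpers, and the leaf labels in the sample data are invented for illustration only.

```python
import re

# A leaf label follows "(" or "," in Newick; internal node labels follow ")".
# Assumption: labels are unquoted and contain no Newick special characters.
LEAF_RE = re.compile(r"(?<=[(,])\s*([^():,;\s]+)")

def species_in_tree(newick):
    """Return the set of species abbreviations found in the leaf labels.

    The abbreviation is the first label segment, i.e. the text before
    the first underscore.
    """
    return {label.split("_", 1)[0] for label in LEAF_RE.findall(newick)}

def filter_trees(trees, min_species=2):
    """Drop trees with fewer than min_species species.

    Returns the kept trees and the set of species that still occur in them.
    """
    kept = [t for t in trees if len(species_in_tree(t)) >= min_species]
    species = set().union(*(species_in_tree(t) for t in kept))
    return kept, species

# Hypothetical sample data: the second tree represents only one species.
sample_trees = [
    "((Ecol_1_rpoB:0.1,Bsub_2_rpoB:0.2):0.05,Saur_1_rpoB:0.3);",
    "(Ecol_1_dnaA:0.1,Ecol_2_dnaA:0.2);",
]
kept_trees, kept_species = filter_trees(sample_trees)
print(len(kept_trees), sorted(kept_species))
```

Applied to the real input, a filter of this kind would be expected to reproduce the counts quoted above (1509 trees, 814 species), provided all_trees.tre matches the assumed format.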

The configuration file for running the program in normal mode (both Stage 1 and Stage 2) is super3GL.ini

Command line to run the program on 512 processors in the MVAPICH-1.2 environment:

mpirun -np 512 -maxtime 600 super3GL

Execution time on the MVS-100K supercomputer at JSCC RAS: 391 min.

The file of basis trees obtained after Stage 1: basis.tre

The supertree file obtained after Stage 2: super3.tre
Note that the tree is incomplete: 82 of the 814 species could not be inserted unambiguously and were skipped (see the end of the execution log).
The same file after decoding the abbreviated species names with the UNCODE utility is super3n.tre; it is produced by the command line:

uncode super3.tre BacNames.csv super3n.tre
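To show what the decoding step amounts to, here is a minimal Python sketch of replacing abbreviated leaf labels with full names. It is not the UNCODE utility and does not claim to match its behaviour. It assumes that the name table has two comma-separated columns (abbreviation, full name), that the tree is plain Newick with unquoted labels, and that spaces in full names are written as underscores; load_names and decode_tree are hypothetical helpers, and the sample rows are invented.

```python
import csv
import io
import re

# A leaf label follows "(" or "," in Newick; group 1 keeps leading whitespace.
LEAF_RE = re.compile(r"(?<=[(,])(\s*)([^():,;\s]+)")

def load_names(csv_text):
    """Build an abbreviation -> full name mapping.

    Assumption: two columns per row, abbreviation first, full name second.
    """
    rows = csv.reader(io.StringIO(csv_text))
    return {row[0].strip(): row[1].strip() for row in rows if len(row) >= 2}

def decode_tree(newick, names):
    """Replace each known leaf abbreviation with its full name.

    Unknown labels are left unchanged. Spaces become underscores so the
    result remains valid unquoted Newick (an assumption of this sketch).
    """
    def repl(match):
        full = names.get(match.group(2), match.group(2))
        return match.group(1) + full.replace(" ", "_")
    return LEAF_RE.sub(repl, newick)

# Hypothetical sample table and tree.
sample_names = load_names("Ecol,Escherichia coli\nBsub,Bacillus subtilis\n")
decoded = decode_tree("((Ecol,Bsub),Xxxx);", sample_names)
print(decoded)
```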

The execution log: super3GL.log
