Six bacterial species from the family Microbacteriaceae are provided as a sample dataset.
Species | Accession | Link |
---|---|---|
Clavibacter nebraskensis NCPPB 2531 | GCA_000355695.1 | NCBI |
Clavibacter insidiosus LMG 3663 | GCA_002240565.1 | NCBI |
Microbacterium hominis NBRC 15708 | GCA_001592125.1 | NCBI |
Microbacterium aurum KACC 15219 | GCA_001974985.1 | NCBI |
Leucobacter chironomi DSM 19883 | GCA_000421845.1 | NCBI |
Leucobacter muris DSM 101948 | GCA_004028235.1 | NCBI |
Download the compressed file below, which contains the genome assemblies and metadata for the species above.
Download samples (5.1 MB) EzAAI_samples.zip
Run the EzAAI extract module on the genomes by entering the following command in your terminal.
$ ezaai extract -i fasta -o db -l labels.tsv
EzAAI will automatically build CDS profile databases for the genomes using Prodigal, printing messages like the following.
EzAAI |: Database extraction completed: sample/db/Cn.db
EzAAI |: ...
EzAAI |: Task finished.
You can list the database files in the output directory with the following command.
$ ls db/
Ci.db Cn.db Lc.db Lm.db Ma.db Mh.db
Enter the following to run an all-by-all pairwise AAI calculation on the profiles extracted above.
$ ezaai calculate -i db/ -j db/ -o out/aai.tsv
The pipeline automatically detects the .db files in the directory and calculates AAI values for every pair of profiles using MMseqs2.
EzAAI |: Calculating AAI... [Task 1/36]
EzAAI |: Calculating AAI... [Task 2/36]
...
EzAAI |: Calculating AAI... [Task 36/36]
EzAAI |: Task finished.
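The task count in the log can be reproduced with a short Python sketch. It assumes that each task corresponds to one ordered pair of profiles, self-comparisons included. That is an inference from the 36 tasks reported for six databases, not a documented guarantee. The profile names are taken from the `ls db/` listing above.

```python
from itertools import product

# Database names from the `ls db/` listing above (extension stripped).
profiles = ["Ci", "Cn", "Lc", "Lm", "Ma", "Mh"]

# -i and -j both point at db/, so every profile is paired with every profile.
# Assumption: one task per ordered pair, including each profile against itself.
tasks = list(product(profiles, profiles))

print(len(tasks))   # number of pairwise comparisons
print(tasks[:3])    # first few (query, target) pairs
```

Under this assumption, six profiles give 6 × 6 = 36 comparisons, which matches the `[Task 36/36]` line.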
Run the following to peek at the contents of the result file.
$ head -7 out/aai.tsv
ID1 ID2 Label1 Label2 AAI
786958951 786958951 Leucobacter muris Leucobacter muris 100.000000
786958951 199206886 Leucobacter muris Clavibacter nebraskensis 61.609688
786958951 334056981 Leucobacter muris Microbacterium hominis 61.465138
786958951 204122518 Leucobacter muris Microbacterium aurum 61.842079
786958951 1073644442 Leucobacter muris Clavibacter insidiosus 61.453637
786958951 727401181 Leucobacter muris Leucobacter chironomi 88.755422
At a glance, the pair from the same genus (Leucobacter muris and Leucobacter chironomi, AAI 88.76) shows a markedly higher AAI than the cross-genus pairs (roughly 61 to 62).
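The same comparison can be made programmatically. The sketch below is a minimal example, assuming the result file is tab-separated with the header shown above. The embedded rows are copied from the `head -7` output, and the helper names `genus` and `load_pairs` are hypothetical, introduced only for illustration.

```python
import csv
import io

# Rows copied from the `head -7 out/aai.tsv` output above.
# Assumption: the real file separates columns with tabs.
SAMPLE = (
    "ID1\tID2\tLabel1\tLabel2\tAAI\n"
    "786958951\t786958951\tLeucobacter muris\tLeucobacter muris\t100.000000\n"
    "786958951\t199206886\tLeucobacter muris\tClavibacter nebraskensis\t61.609688\n"
    "786958951\t334056981\tLeucobacter muris\tMicrobacterium hominis\t61.465138\n"
    "786958951\t204122518\tLeucobacter muris\tMicrobacterium aurum\t61.842079\n"
    "786958951\t1073644442\tLeucobacter muris\tClavibacter insidiosus\t61.453637\n"
    "786958951\t727401181\tLeucobacter muris\tLeucobacter chironomi\t88.755422\n"
)


def genus(label):
    """Return the genus, taken as the first word of a species label."""
    return label.split()[0]


def load_pairs(handle):
    """Read (Label1, Label2, AAI) tuples from a tab-separated file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["Label1"], row["Label2"], float(row["AAI"])) for row in reader]


# Drop self-comparisons, then split by whether both labels share a genus.
pairs = [p for p in load_pairs(io.StringIO(SAMPLE)) if p[0] != p[1]]
same_genus = [p for p in pairs if genus(p[0]) == genus(p[1])]
cross_genus = [p for p in pairs if genus(p[0]) != genus(p[1])]

for a, b, aai in same_genus:
    print(f"same genus:  {a} vs {b}: {aai:.2f}")
for a, b, aai in cross_genus:
    print(f"cross genus: {a} vs {b}: {aai:.2f}")
```

To run it on the full result, replace the `io.StringIO(SAMPLE)` argument with a handle from `open("out/aai.tsv", newline="")`.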
Run the following to perform hierarchical clustering on the matrix produced in the previous step.
$ ezaai cluster -i out/aai.tsv -o out/sample.nwk
EzAAI |: AAI matrix identified. Running hierarchical clustering with UPGMA method...
EzAAI |: Task finished.
The resulting file is in Newick format, which you can either inspect as text,
$ cat out/sample.nwk
(((Microbacterium hominis:9.325725,Microbacterium aurum:9.325725):9.500001,
(Clavibacter nebraskensis:2.373246,Clavibacter insidiosus:2.373246):16.452480):0.335034,
(Leucobacter muris:5.622289,Leucobacter chironomi:5.622289):13.538471);
or visualize as a tree with an external program such as MEGA.
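The tree can also be checked without a viewer. The sketch below is a small hand-written Newick reader, not part of EzAAI, and the function name `root_to_tip` is hypothetical. It sums the branch lengths from the root to each leaf. It handles only the simple form shown above: leaf labels, branch lengths, no internal node labels, no quoting. The last lines check an inference from the numbers in this tutorial, namely that the Leucobacter branch length is consistent with half of (100 − AAI). This is an observation about the sample output rather than a documented formula.

```python
# Newick string copied from `cat out/sample.nwk` above, joined into one line.
NEWICK = (
    "(((Microbacterium hominis:9.325725,Microbacterium aurum:9.325725):9.500001,"
    "(Clavibacter nebraskensis:2.373246,Clavibacter insidiosus:2.373246):16.452480):0.335034,"
    "(Leucobacter muris:5.622289,Leucobacter chironomi:5.622289):13.538471);"
)


def root_to_tip(newick):
    """Return {leaf label: summed branch length from the root to that leaf}."""
    text = newick.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            leaves = {}
            while True:
                leaves.update(parse_node())
                if text[pos] == ",":
                    pos += 1  # more children follow
                    continue
                pos += 1  # consume ")"
                break
        else:
            start = pos
            while pos < len(text) and text[pos] not in ":,()":
                pos += 1
            leaves = {text[start:pos].strip(): 0.0}
        # Optional branch length leading up to this node.
        length = 0.0
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {name: dist + length for name, dist in leaves.items()}

    return parse_node()


tips = root_to_tip(NEWICK)
for name, dist in sorted(tips.items()):
    print(f"{name}: {dist:.6f}")

# Observation from the sample data: half of (100 - AAI) for the Leucobacter pair.
print((100 - 88.755422) / 2)
```

Every leaf ends up at the same distance from the root, which is what an ultrametric tree such as a UPGMA result should show.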