How to get the FASTA file for each rep_cluster annotated by mob_typer

Open Wanli-HE opened this issue 1 year ago • 1 comments

Hi!

Here are the rep_type values annotated by mob_typer. The IDs look like this: 'Col(BS512)', 'Col(KPHS6)', 'Col(MG828)', 'Col(MG828),rep_cluster_2392', 'Col(MP18)', 'Col(VCM04)', 'Col(Ye4449)', 'Col156', 'Col156,Col156', 'Col156,IncFIB', ... 'rep_cluster_816', 'rep_cluster_850', 'rep_cluster_870', 'rep_cluster_889', 'rep_cluster_893', 'rep_cluster_910', 'rep_cluster_943', 'rep_cluster_974', 'rep_cluster_980', 'rep_cluster_992'

How can I get a FASTA file for each rep_cluster, that is, the sequence behind each rep_cluster ID?

Thanks!

best, wanli

Wanli-HE, Jul 27 '22 08:07

Hello, the sequences are in the rep.dna.fas file inside the databases directory of your mob-suite installation (for example, /usr/local/lib/python3.9/dist-packages/mob_suite-3.0.3-py3.9.egg/mob_suite/databases).

You can also get the databases by downloading the archive from https://zenodo.org/record/3786915/files/data.tar.gz?download=1

For example, here are the sequences for Col(BS512) and rep_cluster_816:

>NC_010656|Col(BS512)
ATGAATGCGGCGTTTAAGCGAATGGAAAAGCGAAAGGAGCTATCACCTGTTCAGGGGTGGATCAGGGCTACGGAGGTGACGCGAGGTAAGGATGGCAGCGCACATCCGCATTTTCACTGTCTGCTGATGGTGCAACCTTCTTGGTTTAAAGGGAAGAACTACGTTAAGCACGAACGTTGGGTAGAACTCTGGCGCGATTGCTTGCGGGTGAACTATGAGCCGAATATCGATAT
>002299__NC_021722_00001|rep_cluster_816
ATGATGACACATTCAAAGCACAAATTCACTTTTATTGAAAAATCTTCTGCGTATCAAAAAAAATACTTCCAATTTCCACAAGTTTTGCTATACGGAGAAAAATATAAGTCCCTTAGCGATAGTGCCAAAATTGCCTATATGGTTCTTCAAAGCAGGCTCGACTACTCGTTAAAAAACAATTGGATTGATGAATCAAATCATGTGTATTTCATTTTTACAAACCAAGAGCTGAAATCGCTAATGCATTGGTCAAACGATAAACTTCGTAAGGTTAAATCAGATCTCATAAATGCAAATTTACTGTATCAAGAAGTAGTCGGGTTTAATCCTAAAACGGGAAAAAATGAGCCAAATCGGCTATATTTATCCGAACTGGATGTTAGTGCAACTGATGTTTATCTCAAGGCTTTTGAACCTAATGAAGACGTAAAAACCCATACACAGTACGGGAAACCGAAAATCGGTCGCCCGCAAGAGACCGTTCAAACTACCGAAAACAGCGGGAAACCGAAAATCGGTCGCCCGCGACATAAGAACTCAAGTGAAGCCGGACCCCTTGAAAATAGCGGGAAACCGAAAATCGGTCACGATCTATATAAGACTTTAGATACAAATACTAGAGACAATAAAGAGACAGAAAAACTGGACTTTTCCACAAATCGATATTCACCTGAGATCATTAAAAAGCAAAATCAAGATCTCGTAAAAAATGCCAGAAACTATCTGCCTGAATCAACAACAGGTGGCCTCTTTCTCAACAAAGAAGGCGTTGAACTGCTAGGCCTTTGGTGCCGCTCACCTAAACAATTGCATCGGTTCCTCGGCATTATCCTAAATGCCAAAAAGGCTGTAGAAAGGGAACATGAAGGAACGGCGATTGTACTTGACGATCCGCTATGCCAAGAAATGATAAACAAGACCATGCGCCGTTTTTTCAATATTCTGCGCTCTGACAGTAAAAAAATTAACAATGTTGAAAATTACTTGTTTGGTGCTATGAAAGAAACATTGGTGGCATACTGGAATAAGACACTGACAACTGCTAACAGAGGTGATCCTAATGAGCTCTAA
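
If you want one FASTA file per rep_cluster, something like the following Python sketch might work. It is not part of mob-suite: the function names `read_fasta` and `split_by_rep_type` are made up for illustration, and it assumes every header in rep.dna.fas follows the `>accession|rep_type` pattern shown in the two records above. It uses only the standard library.

```python
from collections import defaultdict
from pathlib import Path
import re


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_rep_type(fasta_path, out_dir, prefix="rep_cluster_"):
    """Write one FASTA file per replicon type whose name starts with `prefix`.

    Returns a dict mapping each replicon type to the file written for it.
    """
    groups = defaultdict(list)
    for header, seq in read_fasta(fasta_path):
        # Assumption: the replicon type is the text after the last "|".
        rep_type = header.rsplit("|", 1)[-1]
        if rep_type.startswith(prefix):
            groups[rep_type].append((header, seq))

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for rep_type, records in groups.items():
        # Replace characters such as "(" and ")" so the name is a safe file name.
        # Two types that differ only in such characters would share one file name.
        safe_name = re.sub(r"[^A-Za-z0-9_.-]", "_", rep_type)
        out_path = out_dir / f"{safe_name}.fasta"
        with open(out_path, "w") as out:
            for header, seq in records:
                out.write(f">{header}\n{seq}\n")
        written[rep_type] = out_path
    return written
```

Calling `split_by_rep_type("/path/to/databases/rep.dna.fas", "rep_clusters")` would then write files such as rep_clusters/rep_cluster_816.fasta, each holding every record with that type. Passing `prefix=""` keeps all replicon types, including the named ones like Col(BS512).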

kbessonov1984, Jul 28 '22 16:07