From Metabuli to phyloseq
Good morning. Has anyone tried running Bracken on Metabuli report outputs, and did it work? From there, is it possible to convert the outputs into BIOM format and then into a phyloseq object for downstream analysis, as we would with kraken-biom, for instance? I am new to these tools and would like to know the usual best practice for getting from classification outputs to something like a phyloseq object for downstream analysis in R.
I put this on my todo list! It seems like a feature that many users would need. I'll leave a comment here when I have something ready to test :)
Thanks! Jaebeom
Sounds great!
Hi @jaebeom-kim
We developed a strategy to convert Kaiju and Kraken2 reports into phyloseq objects and tables. The approach also allows combining multiple reports into a single BIOM object that can then be manipulated in R with phyloseq. Although it is still a prototype, we decided to call this conversion tool Kukulkan. We believe it could be especially helpful for users with limited bioinformatics experience: if adapted, it would make it much easier to handle Metabuli reports and perform downstream analyses directly in RStudio.
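As a rough illustration of the combining step described above (this is not Kukulkan's actual code), here is a minimal Python sketch that merges several Kraken-style reports into one taxon-by-sample count table. It assumes each report is tab-separated with six columns (percentage, clade reads, direct reads, rank, taxID, name) and that comment lines start with `#`; whether a given Metabuli report matches this layout, and which label it uses in the rank column (`S` in Kraken2, possibly `species` elsewhere), should be checked against your own files. The function names `read_report` and `merge_reports` are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_report(handle, rank="S"):
    """Return {(taxid, name): clade read count} for rows at one rank.

    Assumes a Kraken-style layout: pct, clade reads, direct reads,
    rank, taxID, name (tab-separated). Lines starting with '#' are skipped.
    """
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6 or row[0].startswith("#"):
            continue
        clade_reads, row_rank, taxid, name = row[1], row[3], row[4], row[5]
        if row_rank.strip() == rank:
            counts[(taxid.strip(), name.strip())] = int(clade_reads)
    return counts


def merge_reports(reports, rank="S"):
    """Merge {sample name: open report handle} into (samples, table).

    table maps (taxid, name) to {sample: count}; taxa missing from a
    sample get a count of 0.
    """
    samples = sorted(reports)
    table = defaultdict(lambda: dict.fromkeys(samples, 0))
    for sample in samples:
        for taxon, n in read_report(reports[sample], rank).items():
            table[taxon][sample] = n
    return samples, dict(table)


# Tiny made-up reports, for illustration only.
demo_a = io.StringIO(
    "#pct\tclade\tdirect\trank\ttaxID\tname\n"
    "60.00\t60\t60\tS\t562\t      Escherichia coli\n"
    "40.00\t40\t40\tS\t1280\t      Staphylococcus aureus\n"
)
demo_b = io.StringIO("100.00\t25\t25\tS\t562\t      Escherichia coli\n")

demo_samples, demo_table = merge_reports({"A": demo_a, "B": demo_b})
for (taxid, name), row in sorted(demo_table.items()):
    print(taxid, name, [row[s] for s in demo_samples])
```

A table like this could then be written to TSV and read into R to build a phyloseq `otu_table`; clade counts are used here so that each species row includes reads assigned below it.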
It would be fantastic for many users if Metabuli could also provide this functionality, and we hope that the code we developed might serve as a starting point for a similar feature within Metabuli, given that it is already such a powerful tool for taxonomic classification.
I also wanted to ask if you are planning to include the complete ICTV virus database as a pre-built reference, since that would be a great addition for many users.
I think this could be useful, and if you’re interested, please let me know the best way to share the script along with its explanation so you can adapt or refine it further.
Saludos from México
Hi @Enkabloza! Thank you so much for the suggestion. I would appreciate it if you could share the script. I'm not familiar with phyloseq analysis, so your script would help me a lot, and I hope I can help you too at some point.
Regarding the ICTV, I have read that the NCBI taxonomy has been updated to follow ICTV's taxonomy. So, I'm planning to build a new RefSeq prokaryote+virus database using the updated NCBI taxonomy. Is that what you mean by "the complete ICTV virus database"?
Hope to hear more from you
감사합니다 from Korea:)
@jaebeom-kim I truly appreciate your work with Metabuli. One of my hopes when creating this script was that it might inspire integration of direct BIOM/phyloseq-compatible outputs into Metabuli, since this would be a huge help for many researchers who don’t have the time or coding background to restructure classification reports into usable tables.
Kukulkan repository --> https://github.com/Enkabloza/kukulkan-
We created it to support both Kaiju and Kraken, aiming to simplify the workflow for users who want ready-to-use BIOM and abundance tables. Hopefully, this approach will also inspire tools like Metabuli to integrate a similar function, since it would be extremely useful for many researchers who struggle with downstream formatting.
For Kraken users, there is also an existing tool called kraken-biom (by Shareef Dabdoub), which converts Kraken outputs into BIOM format:
https://github.com/smdabdoub/kraken-biom
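For anyone curious what the taxonomy side of such a conversion involves, here is a minimal Python sketch (not kraken-biom's or Kukulkan's code) that recovers a lineage for each taxon from a Kraken-style report, which is the kind of information a phyloseq `tax_table` needs. It assumes the name column is indented two spaces per taxonomic level and that the taxID and name sit in the fifth and sixth tab-separated columns; both are assumptions to check against your own report. The function name `report_lineages` is hypothetical.

```python
def report_lineages(lines, indent=2):
    """Map taxID -> list of names from the top of the tree down to the taxon.

    Depth is inferred from the leading spaces of the name column
    (assumed: `indent` spaces per level). Lines starting with '#'
    and rows with fewer than six columns are skipped.
    """
    path = []      # names along the current root-to-node path
    lineages = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        taxid, raw_name = fields[4].strip(), fields[5]
        depth = (len(raw_name) - len(raw_name.lstrip(" "))) // indent
        del path[depth:]               # climb back up to this row's parent
        path.append(raw_name.strip())
        lineages[taxid] = list(path)
    return lineages


# Tiny made-up report, for illustration only.
demo_report = [
    "50.00\t50\t0\tR\t1\troot\n",
    "50.00\t50\t0\tD\t2\t  Bacteria\n",
    "30.00\t30\t0\tG\t561\t    Escherichia\n",
    "30.00\t30\t30\tS\t562\t      Escherichia coli\n",
    "20.00\t20\t20\tS\t1280\t    Staphylococcus aureus\n",
]
demo_lineages = report_lineages(demo_report)
print(" > ".join(demo_lineages["562"]))
```

The path list acts as a stack: each row truncates it to its own depth before appending its name, so sibling taxa never inherit each other's descendants.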
Saludos amigo