SqueezeMeta
Is it possible to process/analyse MetaEuk results with SqueezeMeta?
Hi, I am working on a dataset containing mostly Eukaryota and used the MetaEuk pipeline to predict introns and exons. I wonder if it is possible to use SqueezeMeta for mapping reads, calculating abundance and visualising the results. I understand that this would mean skipping the assembly and the binning and (probably) providing the MetaEuk results as the assembly. Has anyone tried this, and would it work?
Thanks!
Hi!
It is possible and easy, but there is a caveat.
Assuming that you already have an assembly with the contigs you want to analyze, you can just add the following when calling SqueezeMeta.
-a your_assembly.fasta --nobins
This will skip assembly, and use the provided fasta file instead. This will also skip binning.
The caveat is that we use Prodigal for ORF prediction, and this will not work well with eukaryotic sequences. You can add the -d flag when running SqueezeMeta; this will improve annotation over the regions where Prodigal did not predict an ORF.
In theory, you could also run another ORF predictor and use its output to overwrite the results (gff, fna and faa files) that SqueezeMeta would normally produce when running Prodigal. If done correctly, you should then be able to restart the pipeline from step 4, and it should run to completion. However, we have never tried this, so there could be extra problems along the way.
In any case, if you only want to estimate the abundance of your contigs, annotation wouldn't matter so much.
I made a mistake in the command I recommended above. It would be
-extassembly your_assembly.fasta --nobins -d
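For context, here is a rough sketch of how those flags might sit in a complete call. Only -extassembly, --nobins and -d come from this thread. The -m, -p, -s and -f options are the usual SqueezeMeta arguments for mode, project name, samples file and read directory as far as I know, and every file and project name below is a placeholder. The script only prints the command so it can be checked before running.

```shell
#!/usr/bin/env bash
# Placeholder names (assumptions): replace with your own project and files.
project="metaeuk_project"        # hypothetical project name
samples="samples.tsv"            # hypothetical samples file
reads_dir="raw_reads"            # hypothetical directory holding the fastq files
assembly="your_assembly.fasta"   # contigs supplied instead of a new assembly

# Print the full command instead of executing it (dry run).
build_sqm_cmd() {
    echo "SqueezeMeta.pl -m coassembly -p ${project} -s ${samples} -f ${reads_dir}" \
         "-extassembly ${assembly} --nobins -d"
}

build_sqm_cmd
```

Once the printed line looks right, it can be pasted into the terminal and run as is.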
Thanks! I will give it a try. MetaEuk outputs fasta and gff files for the predicted genes, so there is no need to run Prodigal. However, I know that many pipelines struggle with the intron/exon information.
Will -extassembly your_assembly.fasta --nobins -d still run Prodigal, or is it possible to skip that and go straight to mapping and abundance estimation?
You cannot skip Prodigal, but it shouldn't affect the abundance estimation step.
You also have the sqm_mapper.pl script available, which maps reads to a reference and performs abundance estimation. It is probably better suited to your purposes. Best, J
Closing due to lack of activity. Feel free to reopen.