Estimating the abundances of multiple viral genomes

Open asierFernandezP opened this issue 10 months ago • 5 comments

Hi,

I am currently running coverm genome to estimate the abundance of multiple viral genomes in my samples. However, I am not sure what the best way to do this is:

  • Is it correct to pass a single FASTA file containing all the viral genomes to --genome-fasta-files? Or should I split this FASTA into separate files, each containing only one viral genome? (Or do these two options make no difference at all?)

  • Should I use the --reference option instead?

Thank you, Asier

asierFernandezP avatar Apr 06 '24 21:04 asierFernandezP

I think it is probably easiest to use contig mode instead of genome mode. The only downside is that you cannot output relative abundance. However that is readily calculated from the ratio of the means, perhaps taking into account the number of reads that map.

wwood avatar Apr 06 '24 22:04 wwood

Thank you for the quick response!

And regarding the output: as I am currently using both the --coupled (with paired FASTQs) and --single (with unpaired FASTQ) options, I get two columns of abundances (one for the paired files and one for the unpaired). What would be the best way to combine these into a single column? I am just interested in the total abundance of each contig in my sample, considering both paired and unpaired reads.

asierFernandezP avatar Apr 07 '24 01:04 asierFernandezP

If you are just using the mean output, I think the easiest approach is to add the results of the two columns. It is more complicated for other outputs.
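
As a rough illustration of the suggestion above, here is a minimal Python sketch. Summing works for mean coverage because both columns divide aligned bases by the same contig length. The second function shows why other outputs are more complicated: it assumes RPKM is defined as reads per kilobase of contig per million mapped reads, so the two columns have to be weighted by the number of mapped reads in each read set. The function names, the dictionary layout, and all numbers are hypothetical and are not part of CoverM.

```python
def combine_mean_coverage(paired_mean, unpaired_mean):
    """Add per-contig mean coverage from two read sets mapped to the same contigs.

    Contigs missing from one read set are treated as having zero coverage there.
    """
    contigs = sorted(set(paired_mean) | set(unpaired_mean))
    return {c: paired_mean.get(c, 0.0) + unpaired_mean.get(c, 0.0) for c in contigs}


def combine_rpkm(paired_rpkm, unpaired_rpkm, paired_mapped, unpaired_mapped):
    """Combine two RPKM columns as a weighted average.

    Assumption: RPKM = reads / (contig_kb * mapped_reads / 1e6), so each column
    is weighted by the number of mapped reads in its own read set.
    """
    total_mapped = paired_mapped + unpaired_mapped
    if total_mapped <= 0:
        raise ValueError("no mapped reads in either read set")
    contigs = sorted(set(paired_rpkm) | set(unpaired_rpkm))
    return {
        c: (paired_rpkm.get(c, 0.0) * paired_mapped
            + unpaired_rpkm.get(c, 0.0) * unpaired_mapped) / total_mapped
        for c in contigs
    }


# Hypothetical values, for illustration only.
paired = {"contig_a": 1.5, "contig_b": 3.0}
unpaired = {"contig_a": 0.5, "contig_b": 1.0}
print(combine_mean_coverage(paired, unpaired))
```

The mapped-read counts needed for the RPKM case are not in the coverage table itself, so they would have to come from elsewhere, for example from the mapping statistics of each read set.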

wwood avatar Apr 07 '24 05:04 wwood

In this case I am using RPKM.

asierFernandezP avatar Apr 07 '24 05:04 asierFernandezP

Hi,

Thank you for the amazing tool that has saved a lot of time in my analysis!

I was following this question and I don't fully understand this: "However that is readily calculated from the ratio of the means, perhaps taking into account the number of reads that map."

Does this mean the following?

Total mapped reads 10 out of 100 reads

            mean   reads   %
contig_a    2      3.3     3.3
contig_b    4      6.7     6.7
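
One way to check the numbers above is a short Python sketch that takes the ratio of the means and scales it by the fraction of reads that mapped. It is only one reading of "taking into account the number of reads that map", not a confirmed description of how CoverM computes relative abundance. The function name and argument layout are hypothetical.

```python
def relative_abundance(mean_coverage, mapped_reads, total_reads):
    """Percent abundance per contig: share of summed mean coverage,
    scaled by the fraction of all reads that mapped."""
    total_coverage = sum(mean_coverage.values())
    if total_coverage <= 0 or total_reads <= 0:
        raise ValueError("need positive total coverage and total read count")
    mapped_fraction = mapped_reads / total_reads
    return {
        contig: 100.0 * cov / total_coverage * mapped_fraction
        for contig, cov in mean_coverage.items()
    }


# Values taken from the example in this comment: 10 of 100 reads mapped.
result = relative_abundance({"contig_a": 2, "contig_b": 4}, mapped_reads=10, total_reads=100)
for contig, percent in result.items():
    print(contig, round(percent, 1))
```

With these inputs the sketch gives about 3.3% for contig_a and 6.7% for contig_b, matching the last column. Note that a ratio of mean coverages reflects genome copies rather than read counts, so the "reads" column would only follow the same proportions if the contigs were of equal length.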

I am sorry if this is nonsense

Best,

Johan Sebastián

SebasSaenz avatar Apr 26 '24 12:04 SebasSaenz