
Question: Coverage not calculated for single_easy_bin with less than 5 .bam files

Open Sebastien-Raguideau opened this issue 1 year ago • 2 comments

Hello,

I saw in your code that, when using single_easy_bin, coverage is only used if there are 5 or more BAM files.

So I am just curious: is there a reason why coverage is not used with fewer than 5 samples and a single assembly?

Do I need to use multi_easy_bin if I want to use coverage anyway, and will multi_easy_bin work with a single assembly? Should I even use coverage information with fewer than 5 samples?

Best, Seb

Sebastien-Raguideau avatar Jul 24 '24 16:07 Sebastien-Raguideau

Coverage is always used, it is just processed differently.

luispedro avatar Jul 25 '24 01:07 luispedro

Hi,

Thanks for the swift answer.

I spent more time reading the code. I think I now understand that coverage is not used for training on the must-link part. For these, data_split consists of only kmer_split rather than the combined data including coverage for the split contigs.

So I suppose I would just like to know more about this. Is using this information detrimental to SemiBin results when there are fewer than 5 samples?

Best, Seb

Sebastien-Raguideau avatar Jul 25 '24 10:07 Sebastien-Raguideau
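
For readers following the thread, the behavior described in the last comment can be sketched roughly as below. This is only an illustration of the idea as stated by the poster, not SemiBin's actual code: the function name `build_split_features`, its parameters, and the row-wise list layout are all hypothetical, and the threshold of 5 is taken from the question above.

```python
def build_split_features(kmer_split, cov_split, n_samples, min_samples=5):
    """Hypothetical sketch: choose features for the split (must-link) contigs.

    kmer_split: one row of k-mer features per split contig
    cov_split:  one row of per-sample coverage features per split contig
    n_samples:  number of BAM files (samples)
    """
    # With fewer than `min_samples` samples, only k-mer features are used
    # for the split contigs (as described in the comment above).
    if n_samples < min_samples:
        return [list(row) for row in kmer_split]

    if len(kmer_split) != len(cov_split):
        raise ValueError("kmer_split and cov_split must have the same number of rows")

    # Otherwise, k-mer and coverage features are concatenated per contig.
    return [list(k) + list(c) for k, c in zip(kmer_split, cov_split)]


# Toy usage: 2 split contigs, 3 k-mer features each.
kmer = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
few = build_split_features(kmer, [[5.0, 6.0], [7.0, 8.0]], n_samples=2)
many = build_split_features(kmer, [[1.0] * 5, [2.0] * 5], n_samples=5)
```

In this toy example `few` keeps only the 3 k-mer columns, while `many` has 3 + 5 columns per contig. Whether adding coverage below the threshold would hurt results is the open question in the thread and is not answered by this sketch.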