Skip samples only as necessary

Open: handibles opened this issue 3 years ago • 1 comment

Greets Devs,

Thanks as ever for the toolkit. If my samples vary around a median depth of 50,000 reads and I rarefy along seq(1,000, 50,000, 10,000), RTK drops, at every rarefaction depth, any sample that cannot be rarefied to the largest value, even if that sample has enough reads for the lower depths. In the case above, rarefying up to the median drops half of all samples from every step of the rarefaction.

Possibly it would make more sense to drop samples only as necessary? E.g., in the example above, drop a sample of 45K reads only at the final rarefaction step of 50,000, and keep it at 11K, 21K, 31K and 41K rather than dropping it everywhere as the current implementation does.

I think this might even be the intent, since a warning is issued at each rarefaction depth to notify the user, rather than just once.

Interested to hear if this is by design etc.

handibles, Jun 29 '21 13:06

Yeah. Without having looked at it, this sounds like a bug or unexpected behaviour. I suspect you are using the R package? We might have implemented it that way so that a consistent table is returned. But I agree this might not be what one expects to happen.

This should be fixable, likely with an optional flag.

For now you can probably reach the expected behaviour by running each rarefaction individually.
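
As a rough illustration of that workaround, here is a minimal sketch in plain Python. It is not RTK's API: the helper names `rarefy` and `rarefy_per_depth`, the dictionary layout, and the toy read counts (scaled down to 45 and 60 reads) are all assumptions made for the example. Each depth is run on its own, and a sample is skipped only at the depths it cannot reach.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample.

    `counts` maps feature name -> read count (hypothetical layout).
    """
    # Expand the counts into one entry per read; fine for a toy example,
    # but memory-hungry for real sequencing depths.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rarefied = {}
    for feature in rng.sample(reads, depth):
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


def rarefy_per_depth(samples, depths, seed=0):
    """Run each depth individually, skipping only samples that are too shallow."""
    rng = random.Random(seed)
    results = {}
    for depth in depths:
        results[depth] = {
            name: rarefy(counts, depth, rng)
            for name, counts in samples.items()
            if sum(counts.values()) >= depth  # eligibility is checked per depth
        }
    return results


# Toy data: "shallow" has 45 reads, so it should be missing only at depth 50.
samples = {
    "deep": {"otu1": 40, "otu2": 20},
    "shallow": {"otu1": 30, "otu2": 15},
}
results = rarefy_per_depth(samples, depths=[10, 30, 50])

for depth, tables in results.items():
    print(depth, sorted(tables))
```

The trade-off mentioned above still applies: the per-depth tables no longer contain the same set of samples, so anything downstream has to cope with samples that are absent at the higher depths.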

openpaul, Jun 29 '21 13:06