bigwig output
it will be useful to have bigwig output from mosdepth. this is my tentative plan; feedback from users would be appreciated.
mosdepth will remain unchanged, except that when per-base output is requested, it will output bigwig instead of bed.gz--there will be no option to request bigwig and it will not be written for quantized/regions/thresholds.
quantized/regions/thresholds can be quickly derived from bigwig using bigwig or pybigwig. I will be building out bigwig to support all operations such that a user can run mosdepth once to get per-base bigwig output and then quickly get quantized/regions/thresholds from the bigwig.
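as a rough illustration of why this derivation is cheap, here is a minimal sketch in plain Python. it assumes the per-base data is already available as (start, end, depth) tuples for one chromosome, which is the shape pybigwig's `bw.intervals(chrom, start, end)` returns. the function names `region_mean` and `threshold_counts` are made up for this example and are not part of mosdepth, bigwig-nim, or pybigwig.

```python
from typing import Iterable, List, Sequence, Tuple

# (start, end, depth), 0-based half-open, all on one chromosome
Interval = Tuple[int, int, float]


def region_mean(intervals: Iterable[Interval], start: int, end: int) -> float:
    """Mean depth over [start, end); bases not covered by any interval count as 0."""
    total = 0.0
    for s, e, depth in intervals:
        overlap = min(e, end) - max(s, start)
        if overlap > 0:
            total += overlap * depth
    return total / (end - start)


def threshold_counts(
    intervals: Iterable[Interval], start: int, end: int, thresholds: Sequence[float]
) -> List[int]:
    """Number of bases in [start, end) whose depth is >= each threshold."""
    counts = [0] * len(thresholds)
    for s, e, depth in intervals:
        overlap = min(e, end) - max(s, start)
        if overlap <= 0:
            continue
        for i, t in enumerate(thresholds):
            if depth >= t:
                counts[i] += overlap
    return counts


# hypothetical per-base intervals, not real mosdepth output
per_base = [(0, 10, 0.0), (10, 30, 5.0), (30, 40, 12.0)]
print(region_mean(per_base, 5, 35))
print(threshold_counts(per_base, 5, 35, [1, 10]))  # [25, 5]
```

both functions make a single pass over run-length intervals rather than individual bases, so the cost scales with the number of depth changes in the region, not its length.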
mosdepth will still support reporting quantized/regions/thresholds directly, but the idea is that once these can be pulled from the per-base bigwig nearly instantly, there's less need to report them.
if users foresee any problems with removing per-base.bed.gz in favor of per-base.bed.bw, please let me know.
I would find sending the data to stdout much more useful, as this makes it easy to convert to bigwig if one really needs it (e.g. with bg2bw from cancerit) and to perform custom manipulations on the data, such as normalization by a given size factor. Stdout in combination with a feature to extend reads to a user-defined fragment length would really help, as this could then replace bamCoverage from deeptools while being much faster.
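A minimal sketch of the kind of streaming manipulation meant here, assuming tab-separated chrom/start/end/depth lines and assuming that normalization means dividing depth by the size factor (some tools multiply by a scale factor instead). The function name `scale_bedgraph` is hypothetical.

```python
import io
from typing import Iterable, Iterator


def scale_bedgraph(lines: Iterable[str], size_factor: float) -> Iterator[str]:
    """Yield bedGraph lines with the depth column divided by size_factor."""
    for line in lines:
        line = line.rstrip("\n")
        # skip blank lines and common bedGraph header lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, depth = line.split("\t")[:4]
        yield f"{chrom}\t{start}\t{end}\t{float(depth) / size_factor:g}"


# hypothetical input standing in for a per-base stream
demo = io.StringIO("chr1\t0\t10\t4\nchr1\t10\t25\t8\n")
for row in scale_bedgraph(demo, size_factor=2.0):
    print(row)
```

In a pipeline, `sys.stdin` would take the place of `demo`, and the output could be piped on to a bedGraph-to-bigwig converter.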
thanks for the feedback. (I didn't know about bg2bw). the per-base bed is so large that parsing it takes as much CPU time as generating it. so even if I sent BED to stdout, it would be inefficient. parsing bw will be much faster.
can you expand on "a feature to extend reads to a user-defined fragment length"? I'm not sure what that means.
Read extension to the average fragment length similar to what -fs and/or -pc in bedtools genomecov does.
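To make the request concrete, here is a minimal sketch of fragment-size extension on a single chromosome. It assumes 0-based half-open reads with a strand, and that each read is replaced by an interval of `fragment_size` bases anchored at its 5' end (forward reads extend rightwards from their start, reverse reads leftwards from their end). This is a reading of what the request describes, not bedtools' or mosdepth's actual code, and `extend_read` and `coverage` are hypothetical names.

```python
from typing import Iterable, List, Tuple

Read = Tuple[int, int, str]  # (start, end, strand), 0-based half-open


def extend_read(start: int, end: int, strand: str, fragment_size: int) -> Tuple[int, int]:
    """Replace the read by a fragment_size interval anchored at its 5' end."""
    if strand == "-":
        return max(0, end - fragment_size), end
    return start, start + fragment_size


def coverage(reads: Iterable[Read], fragment_size: int, chrom_len: int) -> List[int]:
    """Per-base depth of the extended reads, using a difference array."""
    diff = [0] * (chrom_len + 1)
    for start, end, strand in reads:
        s, e = extend_read(start, end, strand, fragment_size)
        e = min(e, chrom_len)  # clip at the chromosome end
        if s >= e:
            continue
        diff[s] += 1
        diff[e] -= 1
    depth, running = [], 0
    for d in diff[:chrom_len]:
        running += d  # the cumulative sum turns start/end events into depth
        depth.append(running)
    return depth


# two hypothetical 3 bp reads extended to 6 bp fragments
reads = [(2, 5, "+"), (10, 13, "-")]
print(coverage(reads, fragment_size=6, chrom_len=15))
# [0, 0, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 0, 0]
```

Without extension the two reads would not overlap at all; with it, position 7 is covered by both fragments.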
I can see why one could want -pc, but why use -fs ?
We have no issues with the plan. Looking forward to testing out the bigwig functionality!
this is fairly easy to implement after developing bigwig-nim, but it substantially increases the memory use. I am trying to figure out how to mitigate that. meanwhile, it's possible to use the binary from bigwig-nim to convert the per-base.bed.gz from mosdepth to a bigwig (and then use the bigwig-nim binary to perform summary operations on that as well).