Blacklisted regions sometimes very large: disable 20k merge?

Open ckuenne opened this issue 3 years ago • 2 comments

Hi,

I have checked your mouse mm10 blacklists against some of my ATAC datasets and found that the excluded regions are sometimes really large, much larger than the actual regions of low mappability or high signal. I think this is due to your merging of hit regions that fall within a 20k window. This drops quite a few genes that are completely fine.

Would it be possible to release a version of the v2 blacklists that does not merge neighbouring hits like this? Just the "real" hits?

Also, right now you classify regions that are both "High Signal" and "Low Mappability" as "High Signal". Can you completely separate those?

I think it would make sense to release a much more granular version of this, since merging flanking or intersecting regions is a one-liner using bedtools that runs in seconds and can be done by the user as necessary. Downloading the data and running your tool, on the other hand, takes weeks for just a single organism.
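
To illustrate how cheap the merge step is once granular hits are available, here is a minimal Python sketch of merging regions that overlap or lie within a given distance. The function name `merge_regions`, the `max_gap` parameter, and the example coordinates are made up for illustration and are not taken from the blacklist tool. As far as I know, `bedtools merge -d 20000` on a sorted BED file does roughly the same thing.

```python
from collections import defaultdict


def merge_regions(regions, max_gap=0):
    """Merge (chrom, start, end) intervals that overlap or are at most
    max_gap bp apart. Coordinates are assumed to be BED-style (0-based,
    half-open), so book-ended intervals have a gap of 0 and are merged."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                # First interval on this chromosome
                cur_start, cur_end = start, end
            elif start - cur_end <= max_gap:
                # Close enough to the current block: extend it
                cur_end = max(cur_end, end)
            else:
                # Gap too large: emit the current block and start a new one
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end))
    return merged


# Hypothetical granular hits, not real blacklist coordinates
hits = [
    ("chr1", 1000, 2000),
    ("chr1", 15000, 16000),
    ("chr1", 50000, 51000),
    ("chr2", 100, 200),
]

unmerged = merge_regions(hits)               # only overlapping/book-ended hits merged
merged_20k = merge_regions(hits, 20000)      # hits within 20 kb merged
```

With the example hits above, the first two chr1 regions are 13 kb apart and collapse into one block when `max_gap=20000`, while the third (34 kb away) stays separate. Without a gap allowance, all four regions are kept as they are.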

Best, Carsten

ckuenne, Mar 17 '21 16:03

Yes, this makes sense - I will compute these and share them in the repo.

aboyle, Jul 23 '21 16:07

Thanks!

ckuenne, Jul 25 '21 08:07