
Fixed human genome downloading and added auto-masking feature using dustmasker

Open taltman opened this issue 8 years ago • 3 comments

Hi Derrick,

I've been using Kraken lately and decided to contribute some fixes. First, I incorporated a fix suggested by Silas Kieser for downloading the human genome, and simplified it to use a single wget call:

https://groups.google.com/d/msg/kraken-users/wMNMSPo8Xtw/osYcrx90DgAJ

Next, I built Adam Rivers' suggestion for masking low-complexity regions with dustmasker into kraken-build, and added a '--no-mask' option to turn masking off if desired for reproducibility. If dustmasker is not found, the software falls back to no masking.

https://groups.google.com/d/msg/kraken-users/jjRe21-qyvw/Kq8DXY45CQAJ
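For readers who want a picture of the control flow described above, here is a rough Python sketch. It is not the code in this pull request (kraken-build is not written in Python), and the function names `choose_mask_command` and `hard_mask` are made up for illustration. It assumes dustmasker is invoked with its `-in`, `-infmt fasta`, and `-outfmt fasta` options, which emit soft-masked (lowercase) sequence, and that the lowercase bases are then replaced with N; whether the patch hard-masks in exactly this way is an assumption.

```python
import re
import shutil


def choose_mask_command(fasta_path, no_mask=False, which=shutil.which):
    """Return the dustmasker argv to run, or None when masking is skipped.

    Masking is skipped when the user passes --no-mask (no_mask=True) or
    when dustmasker cannot be found on PATH (the soft-dependency fallback).
    """
    if no_mask:
        return None
    exe = which("dustmasker")
    if exe is None:
        return None
    # -outfmt fasta makes dustmasker write low-complexity regions in lowercase.
    return [exe, "-in", fasta_path, "-infmt", "fasta", "-outfmt", "fasta"]


def hard_mask(soft_masked_fasta):
    """Replace lowercase bases with N on sequence lines; leave headers alone."""
    out = []
    for line in soft_masked_fasta.splitlines():
        if line.startswith(">"):
            out.append(line)  # header lines may legitimately contain lowercase
        else:
            out.append(re.sub(r"[acgt]", "N", line))
    return "\n".join(out)
```

A caller would run the returned command (for example with `subprocess.run`) and pass its output through `hard_mask`, or use the FASTA file unchanged when `choose_mask_command` returns `None`.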

I also updated the documentation to reflect the new soft dependency on dustmasker, and documented the --no-mask option.

Please let me know what you think!

Cheers,

~Tomer

taltman, Feb 22 '17 06:02

Nice! Thanks Tomer - we use dustmasker ourselves in our pipeline, but we hadn't built it into Kraken as an option. I highly recommend 'dust'-ing any genomes before running Kraken (or any competing program) because of the confusion caused by low-complexity sequences.

salzberg, Feb 23 '17 00:02

Thanks for your feedback, Dr. Salzberg! As I've told Derrick, Kraken was instrumental in my finishing my PhD, so I'm happy to be able to contribute back. Please let me know whether you think one of the Kraken devs will accept this pull request.

I'm curious about your thoughts on masking before or after mapping. I see that Heng Li advises masking after read mapping, and discarding reads that land in masked regions. Is this what you advise with Bowtie2, or do you mask DNA before building a Bowtie2 database? Thanks in advance!

https://www.biostars.org/p/170435/#170450

taltman, Feb 23 '17 03:02

@SheaML Right you are! I did have a commit locally that I failed to push. Resolved. Thanks for catching that and reporting it!

taltman, Mar 10 '17 15:03