chromap
barcode length limit?
Hi!
Chromap seems to limit the barcode length to <32 bp.
My barcodes are 32 bp long. When I run with "--read-format bc:0:30", everything works, but when I run with "--read-format bc:0:31", every line in the result file looks like this:
chr10 10742 10987 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1
chr10 10748 10978 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1
chr10 11428 11549 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2
It seems every barcode sequence is reported as poly-A.
If I add "--barcode-whitelist xxx.txt", Chromap reports an error like this: "chromap: src/chromap.cc:372: void chromap::Chromap::LoadBarcodeWhitelist(): Assertion `khash_return_code != -1 && khash_return_code != 0' failed."
But when I trim the barcodes in xxx.txt to 31 bp, everything works.
Could Chromap raise the maximum barcode length to 48 or 64 bp?
Thanks Tao
The current barcode length limit is supposed to be 32 bp. However, I just tested and found a bug in 32 bp barcode processing, so for now Chromap only works correctly when the barcode length is <32 bp. I will provide a quick fix for this issue to support 32 bp barcodes. Supporting barcodes longer than 32 bp would take much more time to work on.
The issue was fixed in the latest master branch of Chromap. You can try it now if you want. It will be in the next release.
I am very interested in accepting barcodes longer than 32bp! Is there a way I can help modify the code and add a pull request?
Thank you :)
I appreciate it. But currently the barcodes are encoded as 64-bit integers, so supporting barcodes longer than 32 bp would require quite some work, since all the barcode processing code would need changing. I don't have the bandwidth to make such a change, and it would also make Chromap slower than it is now.
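To illustrate why a 64-bit integer caps barcodes at 32 bp, here is a minimal sketch of 2-bit-per-base packing. The mapping A=0, C=1, G=2, T=3 and the function names are assumptions for illustration, not taken from the Chromap source.

```python
# Assumed 2-bit mapping (a common convention; not confirmed from Chromap's code).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"
MAX_BASES = 64 // 2  # 2 bits per base, so 32 bases fill one 64-bit word


def encode_barcode(seq: str) -> int:
    """Pack a barcode into an integer that fits in 64 bits."""
    if len(seq) > MAX_BASES:
        raise ValueError(f"{len(seq)} bp does not fit in a 64-bit integer")
    value = 0
    for base in seq:
        # Shift the bases seen so far left by 2 bits and append the new one.
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def decode_barcode(value: int, length: int) -> str:
    """Unpack `length` bases from the integer, last base first."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 3])
        value >>= 2
    return "".join(reversed(bases))
```

Under this assumed mapping, an all-zero word decodes to poly-A (`decode_barcode(0, 32)` gives 32 A's), which would be consistent with the output reported above, although the thread does not state the actual cause of the bug.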
If you really have a special case where you need to handle barcodes longer than 32 bp, you may want to write some custom scripts to process them. It does not seem worth making lots of code changes to support a rare case.
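One possible shape for such a custom script, as a rough sketch only: give each long whitelist barcode a unique shorter surrogate that fits within the limit, then rewrite the barcode reads and the whitelist with the surrogates before running Chromap. The function names and the 16 bp surrogate length are hypothetical choices, not part of Chromap.

```python
def index_to_barcode(index: int, length: int) -> str:
    """Write a non-negative index as a base-4 string over A, C, G, T."""
    if not 0 <= index < 4 ** length:
        raise ValueError("index does not fit in the requested length")
    bases = []
    for _ in range(length):
        bases.append("ACGT"[index % 4])
        index //= 4
    return "".join(reversed(bases))


def build_surrogate_map(long_barcodes, length=16):
    """Map each distinct long barcode to a unique surrogate of `length` bp."""
    unique = sorted(set(long_barcodes))
    return {bc: index_to_barcode(i, length) for i, bc in enumerate(unique)}
```

Because consecutive surrogates differ by a single base, any correction of sequencing errors would have to be done against the original long whitelist before remapping, and the surrogates then passed through as exact matches.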