Zamin Iqbal
Also, since this issue was raised, Robyn and I had a conversation which resulted in **Proposal 3**: as a one-off calculation, sample read-length paths from the PRG, and then...
..where "mask out" could mean modify the original VCF/whatever and regenerate a better PRG
1. I agree about chunking for memory reduction - belongs elsewhere
2. There are methods for finding repeat regions on linear references. The word "repeat" means many things in this...
The Bob Jenkins hash works fine in Cortex. BooPHF/related use a minimal perfect hash for much bigger sets and are fast
I look forward to chatting about this when I get back next week. However, we should be driven by profiling. I think enumeration of kmers in the graph is more of...
Thanks! I will have a go. BTW I wonder if you can get equally good estimates of noise by using a smaller number of SNPs but more sequencing depth (eg...
I'll have a chat with @bingmann about a release and see how he's doing more generally
Thanks so much for the contributions and work @luizirber @johnlees
Should already support amino acid sequences in the sense that it indexes free text. Pass in a .txt file of amino acids and see how you go?
Note that by default it tries to canonicalize kmers (taking each kmer and its reverse complement and choosing the smaller of the two). See this text from the README: With the flag --no-canonicalize any letters or...
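To make the canonicalization step concrete, here is a minimal Python sketch. It is not the tool's own code: the function names are hypothetical, and it assumes "smaller" means lexicographically smaller and that the input contains only A, C, G and T.

```python
# Illustrative sketch only; names are hypothetical, not taken from the tool.
# Assumes DNA input over the alphabet A, C, G, T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Complement each base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a kmer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(seq: str, k: int) -> list[str]:
    """Canonical form of every length-k window of seq, in order."""
    return [canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)]
```

A reverse complement is only defined for nucleotide sequences, so this step has no meaningful equivalent for amino acid or free-text input.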