Eric T. Dawson
As noted by Titus in [an issue](https://github.com/marbl/Mash/issues/27), MinHash sketches should be stable and cross-compatible between programs. It makes sense to fall in line with this convention. To this end, rkmh...
Writing this down here so I can remember it one day, when I've submitted my thesis... Right now we have an API for hashing kmers (`calc_hash(**)`) and reads (`calc_hashes(**)`). However,...
I've made a lot of improvements to [pliib](https://github.com/edawson/pliib), which holds many of the string utility functions used in mkmh. At some point, I'd like to refactor mkmh...
MurmurHash, while relatively fast / dispersive / backwards compatible with Mash, is slower than some newer algorithms. Moving to xxHash should yield a ~2X speed improvement in the hashing portions...
Because we use a reference genome for input, we can easily generate rGFA. This also gets around the many issues of representing paths in GFA 1 and GFA 2. We...
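As a rough illustration of why a reference-based graph maps naturally onto rGFA, the sketch below cuts one reference contig at a set of breakpoints and emits `S` lines carrying the rGFA tags `SN:Z` (stable sequence name), `SO:i` (offset on that sequence), and `SR:i` (rank, 0 for the reference). The function name `rgfa_segments`, the `s1`, `s2`, ... segment naming, and the breakpoint input are assumptions made for this example; it is not svaha2's actual code.

```python
def rgfa_segments(contig_name, sequence, breakpoints):
    """Split a reference contig at breakpoints and return rGFA S lines.

    Hypothetical helper for illustration only. Breakpoints are 0-based
    positions; values outside (0, len(sequence)) are ignored.
    """
    # De-duplicate and sort the cut positions that fall inside the contig.
    cuts = sorted({b for b in breakpoints if 0 < b < len(sequence)})
    bounds = [0] + cuts + [len(sequence)]

    lines = []
    for i, (start, end) in enumerate(zip(bounds, bounds[1:]), start=1):
        fields = [
            "S",
            f"s{i}",                  # segment ID (naming scheme is an assumption)
            sequence[start:end],      # segment sequence
            f"SN:Z:{contig_name}",    # stable name: the source contig
            f"SO:i:{start}",          # 0-based offset on the source contig
            "SR:i:0",                 # rank 0 marks reference segments
        ]
        lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    for line in rgfa_segments("chr1", "ACGTACGT", [3, 6]):
        print(line)
```

Because every reference segment records its own contig and offset, reference coordinates can be recovered from the segments alone, without a separate path record.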
Long VCF lines (such as those with the SV sequence in the ref/alt fields) silently fail parsing. This is standard get_line behavior but is a major issue. Temporarily, I've fixed...
I have repeatedly run into this issue, so it's high time I wrote it down. Variants in a VCF with adjacent positions will cause a segfault, probably because of some...
The GFA output of svaha2 is currently valid GFA 1. However, minigraph requires GFA overlaps in its output. This means the current output of svaha2 is invalid as minigraph input,...
This PR moves from GFA paths to the more stream-friendly Walk (W) line syntax. This should significantly reduce memory usage when outputting path information.
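To make the difference between the two record types concrete, here is a minimal sketch that converts a GFA 1 `P` line into a GFA 1.1 `W` line. The function name `p_line_to_w_line` and its parameters are hypothetical and only illustrate the syntax; the PR's actual implementation is not shown here.

```python
def p_line_to_w_line(p_line, sample, hap_index, seq_id, seq_start="*", seq_end="*"):
    """Convert one GFA 1 P line to a GFA 1.1 W line (illustrative sketch).

    A P line lists steps as 'seg+,seg-'; a W line writes the same steps
    as '>seg<seg'. Sample, haplotype index, and sequence ID are not stored
    in a P line, so the caller supplies them here.
    """
    fields = p_line.rstrip("\n").split("\t")
    if len(fields) < 3 or fields[0] != "P":
        raise ValueError("not a GFA P line")

    walk_parts = []
    for step in fields[2].split(","):
        name, orient = step[:-1], step[-1]
        if orient not in "+-" or not name:
            raise ValueError(f"malformed path step: {step!r}")
        # '+' (forward) becomes '>', '-' (reverse) becomes '<'.
        walk_parts.append((">" if orient == "+" else "<") + name)

    return "\t".join(
        ["W", sample, str(hap_index), seq_id, str(seq_start), str(seq_end), "".join(walk_parts)]
    )


if __name__ == "__main__":
    print(p_line_to_w_line("P\tchr1\ts1+,s2-,s3+\t*", "sample", 0, "chr1", 0, 30))
```

A walk is a plain concatenation of oriented segment names, so each step can be written to the output as soon as it is known.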