Eric T. Dawson

Results 31 issues of Eric T. Dawson

As noted by Titus in [an issue](https://github.com/marbl/Mash/issues/27), MinHash sketches should be stable and cross-compatible between programs. It makes sense to fall in line with this convention. To this end, rkmh...

enhancement

Writing this down here so I can remember it one day, when I've submitted my thesis... Right now we have an API for hashing kmers (`calc_hash(**)`) and reads (`calc_hashes(**)`). However,...

I've made a lot of improvements to [pliib](https://github.com/edawson/pliib), which holds many of the string utility functions used in mkmh. At some point, I'd like to refactor mkmh...

MurmurHash, while relatively fast / dispersive / backwards compatible with Mash, is slower than some newer algorithms. Moving to xxHash should yield a ~2X speed improvement in the hashing portions...

Feature Request

Because we use a reference genome for input, we can easily generate rGFA. This also gets around the many issues of representing paths in GFA 1 and GFA 2. We...

Long VCF lines (such as those with the SV sequence in the REF/ALT fields) silently fail parsing. This is standard get_line behavior but is a major issue. Temporarily, I've fixed...

bug

I have repeatedly run into this issue, so it's high time I wrote it down. Variants in a VCF with adjacent positions will cause a segfault, probably because of some...

bug
enhancement

The GFA output of svaha2 is currently valid GFA 1. However, minigraph requires GFA overlaps in its output. This means the current output of svaha2 is invalid as minigraph input,...

enhancement
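As a stopgap, existing output could be post-processed. The sketch below rewrites the overlap field of GFA 1 link (`L`) lines from `*` to an explicit CIGAR. It assumes that a `0M` overlap is what minigraph expects, which should be checked against minigraph's documentation; `fill_link_overlaps` is a hypothetical helper and not part of svaha2.

```python
def fill_link_overlaps(gfa_lines, overlap="0M"):
    """Yield GFA lines, replacing a '*' overlap on L lines with `overlap`.

    GFA 1 link lines have the form:
        L <from> <fromOrient> <to> <toOrient> <overlap> [tags...]
    ASSUMPTION: "0M" is the overlap value the downstream tool wants.
    """
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # Only touch well-formed link lines whose overlap is unspecified.
        if fields[0] == "L" and len(fields) >= 6 and fields[5] == "*":
            fields[5] = overlap
        yield "\t".join(fields)
```

Because it is a generator, it can be applied line by line to a file handle without loading the whole graph into memory.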

This PR moves from GFA paths to the more stream-friendly Walk (W) line syntax. This should significantly reduce memory usage when outputting path information.
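To illustrate the difference between the two syntaxes, here is a small sketch that converts a GFA 1 `P` line into a GFA 1.1 `W` line. It is not the code in this PR: `path_to_walk` is a hypothetical helper, and the mapping of the path name to the `SeqId` field, the caller-supplied sample name, and the `*` start/end coordinates are all assumptions made for the example.

```python
def path_to_walk(p_line, sample, hap_index=0):
    """Convert a GFA 1 P line to a GFA 1.1 W line.

    P line: P <name> <seg1+,seg2-,...> <overlaps>
    W line: W <sample> <hapIndex> <seqId> <seqStart> <seqEnd> <walk>
    ASSUMPTION: the path name is reused as seqId and the coordinates
    are left unspecified ('*').
    """
    fields = p_line.rstrip("\n").split("\t")
    if fields[0] != "P" or len(fields) < 3:
        raise ValueError("not a GFA P line: %r" % p_line)
    name = fields[1]
    walk_parts = []
    for step in fields[2].split(","):
        seg, orient = step[:-1], step[-1]
        if orient not in "+-" or not seg:
            raise ValueError("bad path step: %r" % step)
        # W lines prefix each segment with '>' (forward) or '<' (reverse).
        walk_parts.append((">" if orient == "+" else "<") + seg)
    return "\t".join(["W", sample, str(hap_index), name, "*", "*", "".join(walk_parts)])
```

For example, `path_to_walk("P\tchr1\t1+,2-,3+\t*", "ref")` gives a `W` line whose walk field is `>1<2>3`.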