sourmash
Quickly search, compare, and analyze genomic and metagenomic data sets.
per user suggestion via matrix chat. and/or display how to adjust it in search output.
This issue extracts the leftover bits from https://github.com/sourmash-bio/sourmash/issues/1352, mostly to do with evolving the `Storage` class and SBTs to better intersect. From [comment](https://github.com/sourmash-bio/sourmash/issues/1352#issuecomment-797685247) - @luizirber: > Quick comment on supporting...
The `MultiIndex` class is used for four purposes: * loading one or more signatures from JSON via stdin * loading one or more signatures from a JSON file * loading...
After https://github.com/dib-lab/sourmash/pull/1420, we run the risk of silently selecting away large numbers of incompatible signatures. Perhaps we should print this out in the `_load_database` code? See for example `test_search_traverse_incompatible` as...
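A minimal sketch of the kind of reporting suggested above, assuming a simple filter on k-mer size and molecule type. The `Sig` class and `select_with_report` function are hypothetical stand-ins for illustration only; they are not part of sourmash and do not reflect how `_load_database` is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sig:
    # Hypothetical stand-in holding only the parameters used for selection.
    name: str
    ksize: int
    moltype: str


def select_with_report(sigs, ksize, moltype):
    """Keep compatible signatures and build a warning if any were dropped."""
    kept = [s for s in sigs if s.ksize == ksize and s.moltype == moltype]
    n_dropped = len(sigs) - len(kept)
    message = None
    if n_dropped:
        message = (
            f"WARNING: skipped {n_dropped} of {len(sigs)} signatures "
            f"incompatible with ksize={ksize}, moltype={moltype}"
        )
    return kept, message


sigs = [Sig("a", 31, "DNA"), Sig("b", 21, "DNA"), Sig("c", 31, "protein")]
kept, message = select_with_report(sigs, ksize=31, moltype="DNA")
if message:
    print(message)
```

Returning the message rather than printing it inside the function keeps the selection logic easy to test; the caller decides whether and where to display it.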
There's a confusing mess of loading functions in `sourmash_args.py` that's slowly converging as we simplify and refactor our Index handling code. In no particular order, there's **query loading code**, `load_query_signature`....
I don't think it serves a purpose any more; IIRC it was used to indicate that 'number of signatures' was not a useful designation Back When, but that is not...
Use https://github.com/obi1kenobi/cargo-semver-checks-action to check for breaking changes in the public API of the Rust crate.
I'm having difficulty downloading the GenBank databases. I was able to download GTDB and GenBank viral k31. Is there any other place I can find these files? Please suggest how...
The CSV parsing code in `src/sourmash/lca/command_index.py` is just terribly complicated and parts of it (`--no-headers`, for example) are not even tested. Given the ongoing maintenance and expansion of `tax` taxonomy...
When running large databases (e.g. building a new alphabet or ksize for all of gtdb), it would help to have some progress output from `fromfile`, since we have the whole...