sourmash icon indicating copy to clipboard operation
sourmash copied to clipboard

add some rust developer docs!

Open ctb opened this issue 1 year ago • 10 comments

how to run the tests right: https://github.com/sourmash-bio/sourmash/pull/3138#issuecomment-2104746508

ctb avatar May 10 '24 16:05 ctb

syntax for specifying a specific branch when simultaneously working on sourmash-rs and external modules (e.g. a plugin that uses the rs interface) - from #3193 / https://github.com/sourmash-bio/sourmash_plugin_branchwater/pull/348

sourmash = { git = "https://github.com/sourmash-bio/sourmash.git", branch = "debug_rust_gather", features = ["branchwater"] }

And each time you push to the new sourmash branch, you'll need to run:

cargo update -p sourmash

in the dependent repository to get it to recognize the updated dependency.

ctb avatar Jun 08 '24 13:06 ctb

this let me get tracing output:

diff --git a/src/core/Cargo.toml b/src/core/Cargo.toml
index 91b76d62..7dfd9ffa 100644
--- a/src/core/Cargo.toml
+++ b/src/core/Cargo.toml
@@ -33,6 +33,7 @@ cfg-if = "1.0"
 counter = "0.5.7"
 csv = "1.3.0"
 enum_dispatch = "0.3.13"
+env_logger = "0.11.3"
 finch = { version = "0.6.0", optional = true }
 fixedbitset = "0.4.0"
 getrandom = { version = "0.2", features = ["js"] }
diff --git a/src/core/src/index/revindex/mod.rs b/src/core/src/index/revindex/mod.rs
index 0e3e443b..4b3c5bde 100644
--- a/src/core/src/index/revindex/mod.rs
+++ b/src/core/src/index/revindex/mod.rs
@@ -12,6 +12,7 @@ use getset::{Getters, Setters};
 use nohash_hasher::BuildNoHashHasher;
 use roaring::RoaringBitmap;
 use serde::{Deserialize, Serialize};
+use env_logger::try_init;
 
 use crate::collection::CollectionSet;
 use crate::encodings::{Color, Colors, Idx};
@@ -591,6 +592,8 @@ mod test {
 
     #[test]
     fn revindex_load_and_gather_2() -> Result<()> {
+        let _ = env_logger::try_init();
+
         let mut basedir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
         basedir.push("../../tests/test-data/gather/");

command to run w/trace output:

RUST_BACKTRACE=full RUST_LOG=trace cargo test --features branchwater index::revindex::test::revindex_load_and_gather_2

ctb avatar Jun 08 '24 14:06 ctb

also: core release procedure! https://github.com/sourmash-bio/sourmash/issues/2987

ctb avatar Jun 10 '24 17:06 ctb

Since the tests in sourmash_plugin_branchwater are run with pytest and not cargo, how can you get the Rust backtrace for the branchwater code like https://github.com/sourmash-bio/sourmash/issues/3150#issuecomment-2156053900?

olgabot avatar Jun 13 '24 13:06 olgabot

When developing https://github.com/sourmash-bio/sourmash_plugin_branchwater/pull/354, I learned that to get the Rust code to first recompile with maturin develop (and give me compiler error messages!) before running the tests, for example, choosing only the test index_multiple_manifests with verbose mode (-v) and the Python debugger on (--pdb):

maturin develop && pytest -v -k index_multiple_manifests --pdb

olgabot avatar Jun 13 '24 13:06 olgabot

Compile documentation for the project and all dependencies with cargo doc --open, which will open an HTML page with all documentation, as described in the Rust Book

olgabot avatar Jun 21 '24 15:06 olgabot

To get the rust-analyzer VS Code extension working with an Anaconda installation of Rust, I needed to add to my .vscode/settings.json:

    "rust-analyzer.server.extraEnv": {
        "CARGO": "/Users/olgabot/anaconda3/envs/branchwater-dev/bin/cargo",
        "RUSTC": "/Users/olgabot/anaconda3/envs/branchwater-dev/bin/rustc",
    },

Hoping to save others some headache in the future!

olgabot avatar Jun 28 '24 14:06 olgabot

Over in #3249, I asked:

how do you feel about adding anyhow to sourmash-rs core?

and on slack, luiz responded:

re anyhow: I usually follow this advice https://github.com/dtolnay/anyhow?tab=readme-ov-file#comparison-to-thiserror

Use Anyhow if you don't care what error type your functions return, you just want it to be easy. This is common in application code. Use thiserror if you are a library that wants to design your own dedicated error type(s) so that on failures the caller gets exactly the information that you choose.

somewhat same with env_logger: it's for applications, not libs. tho if you're looking for it in tests you can init it there (and make it a dev-dependency)


note that thiserror is indeed part of sourmash already :)

(transferring this here because I was keeping #3249 open only because of this comment 😆 )

ctb avatar Aug 20 '24 13:08 ctb

hmm, maturin import hook => automatically runs maturin develop when code changes are present in rust code.

Once the hook is active, any import statement that imports an editable-installed maturin project will be automatically rebuilt if necessary before it is imported.

ctb avatar Aug 31 '24 13:08 ctb

hmm, maturin import hook => automatically runs maturin develop when code changes are present in rust code.

Once the hook is active, any import statement that imports an editable-installed maturin project will be automatically rebuilt if necessary before it is imported.

Related to this, it seems to me that re-building librocksdb-sys takes up a substantial amount (50%+??) of the build time. Is it possible to only re-build the sourmash part and not the dependencies with maturin develop? Adding --skip-install didn't help.

Building [=======================> ] 261/266: librocksdb-sys(build)  

Her's an example build, which is a bit delayed because of a lock on the build due to rust-analyzer running in my VS Code, but still shows that librocksdb-sys takes forever!

And all that I changed for this build was adding a consuming variable:

        frequencies
            .values_mut()
            .map(| freq| 
                freq.ln()
            );

to

        let _ = frequencies
            .values_mut()
            .map(| freq| 
                freq.ln()
            );

How does that trigger a full rebuild, especially of librocks-db?

Here's a gif of the build, which took 5 minutes from 16:00 UTC to 16:05 UTC: (screencast was resized from 15MB to 10MB to be under the 10MB limit for file uploads to GitHub, sorry if it looks weird)

2024-10-01maturindeveloprebuildseverythingrocksdbtakesforever-ezgif com-resize

Other PRs complain about this behavior (https://github.com/PyO3/maturin/issues/504, https://github.com/rerun-io/rerun/pull/3474), and say that updating maturin to 1.3.1 fixes things, but I'm on 1.7.4... so I'm not sure how to fix things. Any thoughts?

(branchwater)
 ✘  Tue  1 Oct - 15:54  ~/sourmash_plugin_branchwater   origin ☊ olgabot/multisearch-evalue ↑1 1● 
 ec2-user@x86_64-conda-linux-gnu  maturin --version                                     
maturin 1.7.4

olgabot avatar Oct 01 '24 16:10 olgabot