
feat/better seed debug logging

Open corneliusroemer opened this issue 1 year ago • 1 comments

  • feat: log basic seed and align stats at debug level

We currently make almost no use of debug-level logging. I think it'd be useful to output some basic stats for each sequence, such as the proportion of the query that is covered by chained matches, the mean band width, and the max indel.

This is what this PR outputs at debug level (-vv):

2023-08-30 13:17:52.012 [D] compression.rs:68: When processing '"nextclade.tsv"': detected file extension 'tsv'. It will be using algorithm: 'None'
2023-08-30 13:17:52.012 [I] nextclade_loop.rs:87: Processing sequence 'Ireland/CO-NVRL-ecM21IRL00199074/2021'
2023-08-30 13:17:52.036 [D] seed_match2.rs:509: Chained seed stats. max indel: 12, # matches: 181, first/last match distance from start/end [start: (ref: 12, qry: 0), end: (ref: 129, qry: 0)], max unmatched stretch (ref: 173, qry stretch: 173)
2023-08-30 13:17:52.036 [D] seed_match2.rs:514: Seed alignment covers 87.09% of query length
2023-08-30 13:17:52.040 [D] align.rs:83: Nucleotide alignment band area=24370141, mean band width=814
2023-08-30 13:17:57.535 [D] align.rs:91: Attempt: 0, Alignment score: 86887
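
For readers who want a feel for how stats like these can be derived, here is a minimal sketch. It is not the code from this PR. The `SeedMatch` struct, the function names, and the exact definitions are assumptions made for illustration: "max indel" is taken as the largest diagonal shift between consecutive chained matches, and "mean band width" as band area divided by the number of rows. The sketch only formats the message and leaves out the actual logging call.

```rust
/// Hypothetical chained seed match: `length` bases that match exactly,
/// starting at `ref_pos` in the reference and `qry_pos` in the query.
#[derive(Debug, Clone, Copy)]
struct SeedMatch {
    ref_pos: usize,
    qry_pos: usize,
    length: usize,
}

/// Fraction of the query covered by chained matches.
/// Assumes the matches do not overlap in the query.
fn query_coverage(matches: &[SeedMatch], qry_len: usize) -> f64 {
    if qry_len == 0 {
        return 0.0;
    }
    let covered: usize = matches.iter().map(|m| m.length).sum();
    covered as f64 / qry_len as f64
}

/// Largest change in diagonal (ref_pos - qry_pos) between consecutive matches.
/// A change of k means at least k inserted or deleted bases between them.
fn max_indel(matches: &[SeedMatch]) -> usize {
    matches
        .windows(2)
        .map(|w| {
            let d0 = w[0].ref_pos as isize - w[0].qry_pos as isize;
            let d1 = w[1].ref_pos as isize - w[1].qry_pos as isize;
            (d1 - d0).unsigned_abs()
        })
        .max()
        .unwrap_or(0)
}

/// Mean band width, given one half-open (begin, end) column range per row.
fn mean_band_width(stripes: &[(usize, usize)]) -> f64 {
    if stripes.is_empty() {
        return 0.0;
    }
    let area: usize = stripes.iter().map(|&(begin, end)| end - begin).sum();
    area as f64 / stripes.len() as f64
}

/// Formats the coverage line in the same shape as the log output above.
fn coverage_message(matches: &[SeedMatch], qry_len: usize) -> String {
    format!(
        "Seed alignment covers {:.2}% of query length",
        100.0 * query_coverage(matches, qry_len)
    )
}
```

In a real code path the string from `coverage_message` would be passed to whatever debug-level logging macro the project uses, so that it only shows up with -vv.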

corneliusroemer, Aug 30 '23 13:08
