Cornelius Roemer

Results 501 issues of Cornelius Roemer

### Context
Our metadata.tsv is now so big that it becomes impossible to load into memory in full. We should either create an "essential" metadata.tsv or in general partition metadata.tsv...

enhancement
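One way the "essential" metadata.tsv idea could be sketched is to stream the file row by row and keep only a few columns, so the full table never has to sit in memory. The function name `write_essential_metadata` and the column names `strain` and `date` are assumptions for illustration, not taken from the issue; only `Nextstrain_clade` appears elsewhere in this listing.

```python
def write_essential_metadata(src, dst, columns):
    """Copy only `columns` from a TSV stream to `dst`, one row at a time.

    `src` and `dst` are open text file objects. Returns the number of
    data rows written. Hypothetical helper, not part of ncov-ingest.
    """
    header = src.readline().rstrip("\r\n").split("\t")
    # Raises ValueError if a requested column is missing from the header.
    indices = [header.index(name) for name in columns]
    dst.write("\t".join(columns) + "\n")
    rows = 0
    for line in src:
        fields = line.rstrip("\r\n").split("\t")
        dst.write("\t".join(fields[i] for i in indices) + "\n")
        rows += 1
    return rows


if __name__ == "__main__":
    import io

    # Tiny made-up table; the column choice is an assumption.
    demo = io.StringIO("strain\tdate\tregion\nA\t2022-01-01\tEurope\n")
    out = io.StringIO()
    write_essential_metadata(demo, out, ["strain", "date"])
    print(out.getvalue(), end="")
```

Because only one line is held at a time, memory use stays flat regardless of file size; partitioning by some key would need a different approach.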

### Context
Non-urgent code review items from https://github.com/nextstrain/ncov-ingest/pull/372
- [ ] Use Nextclade from a Docker container rather than a manually downloaded one
- [ ] GENES_SPACE_DELIMITED trick is possibly no longer...

enhancement

We currently output only composite clade names, e.g. `21L (Omicron)`. Nextclade now also produces atomic clades that are Nextstrain-only and WHO-only: `21L` and `Omicron`. Nextclade will at some point...

It'd be good as a sanity check to have clade counts output after every ingest run. This command would be enough:
```
zcat metadata.tsv.gz | tsv-summarize -H --group-by Nextstrain_clade --count
```
...

enhancement
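For the clade-count sanity check, a rough Python equivalent of the `tsv-summarize` command might look like the sketch below. It assumes the file is a gzipped TSV with a header row containing `Nextstrain_clade`, as the command implies; the function name `count_clades` is hypothetical.

```python
import gzip
from collections import Counter


def count_clades(path, column="Nextstrain_clade"):
    """Stream a gzipped TSV and count rows per value of `column`."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        header = handle.readline().rstrip("\r\n").split("\t")
        index = header.index(column)  # ValueError if the column is absent
        for line in handle:
            counts[line.rstrip("\r\n").split("\t")[index]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_clades.py metadata.tsv.gz
    for clade, n in sorted(count_clades(sys.argv[1]).items()):
        print(f"{clade}\t{n}")
```

Comparing these counts between consecutive ingest runs would make a sudden drop or jump in any clade easy to spot.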

Right now we seem to exclude sequences from the B.1 build that lack a month, i.e. year-only sequences dated `2022-XX-XX`. They get filtered out in subsampling as they don't fit neatly...

enhancement

### Context
Thought I'd take notes on what needs to be taken into account when moving a workflow to a folder. These are all the relevant commits I could find...

Right now we seem to label sequences as reverse-complemented in ingest, but we don't actually fix the wrong orientation in the output sequences. It would make downstream processing easier if...

enhancement
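Fixing the orientation of flagged sequences could, in principle, be as small as the sketch below. It assumes the names of reverse-complemented sequences are already available as a set (how ingest stores that label is not shown in the issue), and both function names are hypothetical.

```python
# IUPAC nucleotide complements; S, W, N and gap characters map to themselves.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVacgtrykmbdhv",
    "TGCAYRMKVHDBtgcayrmkvhdb",
)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fix_orientation(records, reversed_names):
    """Yield (name, sequence) pairs, re-orienting the flagged ones.

    `records` is an iterable of (name, sequence) tuples and
    `reversed_names` is a set of names labelled as reverse-complemented.
    """
    for name, seq in records:
        yield name, (reverse_complement(seq) if name in reversed_names else seq)
```

Applying `reverse_complement` twice returns the original string, so the transformation is safe to verify round-trip.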

### Context
It'd be great if one could see sequence length directly from the metadata.tsv
### Description
We currently don't have it in the metadata, but it's easy to compute based on...

enhancement
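As an illustration of how a length column could be derived, the sketch below computes per-record lengths from FASTA-formatted lines. That the length would come from the sequences FASTA is an assumption, since the issue text is truncated; `fasta_lengths` is a hypothetical name.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines.

    The name is everything after '>' on the header line; the length is the
    total number of characters on the record's sequence lines.
    """
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            name, length = line[1:], 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length
```

The resulting pairs could then be joined to metadata.tsv by sequence name; whether gaps or ambiguous bases should count toward the length is a design choice this sketch does not settle.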