John Huddleston
I forgot to include context of what I was doing to run into this same issue: I've been developing an updated Nextstrain introductory tutorial to expand the current "Zika tutorial"...
Looking at other Nextclade datasets today, I noticed that most others follow the general GFF3 pattern we'd expect. For example, [the H3N2 HA gene map](https://github.com/nextstrain/nextclade_data/blob/master/data/datasets/flu_h3n2_ha/references/CY163680/versions/2022-06-08T12:00:00Z/files/genemap.gff) looks like this: ```gff3 ##gff-version...
@corneliusroemer, do you have an example dataset that uses these floating point dates in the metadata? @rneher pointed out that the use case for these dates is for analyses of...
> It would be nice if branches were not filtered out by the date range filter. I would like the behaviour to be similar to normal filters. I feel exactly...
I think this issue arose as part of [this Slack conversation](https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1653486825748339?thread_ts=1653420595.121749&cid=C01LCTT7JNN). @corneliusroemer, am I correct in this?
Related to https://github.com/nextstrain/augur/issues/642
Thank you for writing this up, @victorlin! For the CLI, we should consider continuing to support separate `--metadata` and `--sequences` arguments vs. adding a `--database` argument. The pandas `read_csv` API that...
Thanks, @joverlee521 and @victorlin! I actually prefer to default to throwing an error when we detect duplicates, since duplicates for a single strain name are almost always going to reflect...
We can definitely build out a new subcommand gradually over time and start with deduplication. [This issue](https://github.com/nextstrain/augur/issues/860) is the best place to start, for reference. We got stuck on the...
@rneher Do you have example data files we could use to write a test for this behavior before we fix the bug?