John Huddleston

Results 94 issues of John Huddleston

## Context

Prior to the SARS-CoV-2 pandemic, we assumed that Nextstrain workflows would begin with pre-curated sequences and metadata sourced either from our internal database ("fauna") or custom scripts maintained...

enhancement

## Description of proposed changes

Instead of [dutifully writing all sequences for strains that pass given filters](https://discussion.nextstrain.org/t/error-in-augur-tree-duplicated-sequence-name/970/4), only write out the first sequence for each strain from the given input...

needs triage
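The proposal above (keep only the first sequence per strain) can be illustrated with a minimal sketch. This is not Augur's implementation: the function name `first_sequence_per_strain` and the `(name, sequence)` tuples are hypothetical stand-ins for whatever record type the real code uses.

```python
def first_sequence_per_strain(records):
    """Yield only the first (name, sequence) pair seen for each strain name.

    `records` is any iterable of (name, sequence) tuples; this shape is an
    assumption made for illustration, not Augur's actual record type.
    """
    seen = set()
    for name, sequence in records:
        if name in seen:
            # A later duplicate of an already-written strain is skipped.
            continue
        seen.add(name)
        yield name, sequence


# Strain "A" appears twice; only its first sequence is kept.
records = [("A", "ACGT"), ("B", "ACGA"), ("A", "ACGG")]
deduplicated = list(first_sequence_per_strain(records))
```

Tracking seen names in a set keeps the pass streaming: input order is preserved and memory grows with the number of distinct strain names rather than with the sequence data.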

### Current Behavior

The help text for augur export v2 says the following (emphasis mine):

> You can supply a config JSON (which has all available options) or command line...

bug

**Context** Augur's current backend for transparently reading/writing compressed data is [xopen](https://github.com/pycompression/xopen/). [xopen uses the standard Python LZMA module to handle xz files](https://github.com/pycompression/xopen/blob/9edf101dfad2f91f7fc5d268c61dc0c49e65597c/src/xopen/__init__.py#L506-L507), but it only uses the `filename` and `mode`...

enhancement

## Current Behavior

Reported by @rneher:

> `augur distance` fails in silent ways when the `root` sequence is not part of the alignment. This has to do with this defaultdict...

bug
easy problem
please take this issue
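The silent failure reported above is consistent with how `collections.defaultdict` behaves in general: looking up a missing key returns a default value and inserts the key instead of raising `KeyError`. The sketch below shows that behavior and one possible guard. The excerpt does not show the actual defaultdict used by `augur distance`, so the names `alignment` and `require_sequence` are hypothetical.

```python
from collections import defaultdict

# Hypothetical alignment keyed by sequence name; missing names default to "".
alignment = defaultdict(str)
alignment["sample_1"] = "ACGT"

# No KeyError here: the lookup returns "" and also inserts "root" as a key,
# so later comparisons against the root run on an empty sequence.
silent_root = alignment["root"]


def require_sequence(sequences, name):
    """Return the named sequence or raise a descriptive KeyError."""
    # A membership test does not trigger the default factory.
    if name not in sequences:
        raise KeyError(f"Sequence '{name}' is not in the alignment.")
    return sequences[name]


# With an explicit check, a missing root is reported instead of ignored.
strict_alignment = {"sample_1": "ACGT"}
try:
    require_sequence(strict_alignment, "root")
    error_message = None
except KeyError as error:
    error_message = error.args[0]
```

Checking membership before the lookup works for both plain dictionaries and defaultdicts, so a guard like this could be added without changing the underlying data structure.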

**Context** [Since version 1.3.0, xopen supports passing `encoding` and `newline` arguments through to the lower level `open` function calls for each compression method](https://github.com/pycompression/xopen#v130). When we added the `io` module to...

enhancement

## Current Behavior

[Users have described running the sanitize metadata script in the ncov workflow with input data that includes large fields](https://discussion.nextstrain.org/t/sanitize-metadata-py-error-error-expected-after/807). These fields exceed Python's default limit, requiring users...

bug

## Current Behavior

The `read_metadata` function in the `io` module checks for the presence of valid id columns to index the resulting pandas DataFrame. [This function raises a generic exception...

bug

The sampling bias correction argument for augur traits has a brief description of the parameter, but we do not have any documentation about when to use this argument and how...

documentation

**Solution** Update the docstring for the distance module to include examples of how to ignore specific characters and also describe how indels get treated as single events.

**Context** This issue...

documentation