filter: Prefer "output sequences" over "output"?
augur filter allows --output, --output-sequences, and -o to be used interchangeably:
https://github.com/nextstrain/augur/blob/da1c89d4b3232aa977b69fb8df33d666532c9a56/augur/filter/init.py#L105
Because --output is listed first, argparse derives the default dest from it, so the value must be referenced internally as args.output.
"output" is ambiguous since this is just one of many output options. I would prefer the more specific name to align with other options and subcommands.
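To illustrate the dest behavior described above, here is a minimal sketch using a stand-alone argparse parser. It is not the actual augur code: the parser, the help text, and the filtered.fasta file name are hypothetical stand-ins. Only the order of the option strings is taken from the linked line.

```python
import argparse

# Hypothetical stand-in for the augur filter parser; only the order of the
# option strings mirrors the linked source line.
parser = argparse.ArgumentParser(prog="augur filter")
parser.add_argument(
    "--output", "--output-sequences", "-o",
    help="filtered sequences (illustrative help text)",
)

args = parser.parse_args(["--output-sequences", "filtered.fasta"])

# argparse infers dest from the first long option string, so the value is
# stored as args.output no matter which spelling the user typed.
print(args.output)                        # filtered.fasta
print(hasattr(args, "output_sequences"))  # False
```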
Two layers to this proposal:
- Prefer "output sequences" over "output" internally.
  - Use dest='output_sequences' and args.output_sequences.
- Prefer "output sequences" over "output" for users.
  - Reorder the options to '--output-sequences', '--output', '-o' so that the preferred name is displayed first. This would remove the need for an explicit dest.
  - A bigger change would be deprecating the --output/-o flags and removing in a major release, but maybe that's not necessary and would just be extra churn.
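As a rough sketch of the two non-breaking variants in this proposal, the snippet below builds two stand-alone argparse parsers. The names explicit and reordered and the filtered.fasta file name are hypothetical; this is not the augur implementation. In both variants all three spellings keep working and the value is available as args.output_sequences.

```python
import argparse

# Variant 1: keep the current option order and set dest explicitly.
explicit = argparse.ArgumentParser(prog="augur filter")
explicit.add_argument(
    "--output", "--output-sequences", "-o",
    dest="output_sequences",
)

# Variant 2: list the preferred name first, so dest is inferred from it
# and the preferred name is shown first in the help text.
reordered = argparse.ArgumentParser(prog="augur filter")
reordered.add_argument("--output-sequences", "--output", "-o")

# All three spellings remain accepted by both parsers.
for parser in (explicit, reordered):
    for flag in ("--output-sequences", "--output", "-o"):
        args = parser.parse_args([flag, "filtered.fasta"])
        print(flag, "->", args.output_sequences)
```

The second variant needs no explicit dest, which is why reordering covers both layers at once.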
Thanks for documenting this so clearly, @victorlin. I'm definitely in favor of preferring --output-sequences for users and eventually deprecating --output.
It might be worth considering doing the same in augur index. Current usage:
usage: augur index [-h] --sequences SEQUENCES --output OUTPUT [--verbose]

Count occurrence of bases in a set of sequences.

options:
  -h, --help            show this help message and exit
  --sequences SEQUENCES, -s SEQUENCES
                        sequences in FASTA or VCF formats. Augur will summarize the content of FASTA sequences and only report the names of strains found in a given VCF. (default: None)
  --output OUTPUT, -o OUTPUT
                        tab-delimited file containing the number of bases per sequence in the given file. Output columns include strain, length, and counts for A, C, G, T, N, other valid IUPAC characters, ambiguous characters ('?' and '-'), and other invalid characters. (default: None)
  --verbose, -v         print index statistics to stdout (default: False)
There, --output-sequences needs to be added first.