
New option to output sequence length

Open frederic-mahe opened this issue 8 years ago • 1 comments

This is low priority, as it can be easily done with a few shell commands.

An option to output sequence length (in nucleotides) would be useful. It could be named --lengthout, on the same model as --sizeout and --eeout. It could be inserted in fasta/fastq headers as such: ;length=243[;].
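For reference, the "few shell commands" workaround could look roughly like the sketch below. It is not vsearch code: `add_length` is a hypothetical helper name, it only handles FASTA (not FASTQ), and it writes each sequence on a single line as a side effect.

```shell
# Hypothetical helper (not part of vsearch): append ";length=N" to every
# FASTA header read from stdin. Multi-line sequences are joined first so
# that N counts all nucleotides of the entry.
add_length() {
    awk '
        function emit() {
            if (header != "") {
                printf "%s;length=%d\n%s\n", header, length(seq), seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }   # new entry starts
        { seq = seq $0 }                               # accumulate sequence
        END { emit() }                                 # flush last entry
    '
}

# Usage example on a small in-line FASTA input
printf '>s1;size=3\nACGT\nAC\n>s2\nGGG\n' | add_length
```

The awk approach avoids any dependency beyond a POSIX shell, which is why the request was filed as low priority.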

frederic-mahe avatar Feb 18 '16 16:02 frederic-mahe

And an option --xlength to remove the length annotation if present.

frederic-mahe avatar Dec 17 '17 15:12 frederic-mahe

Added the --lengthout and --xlength options in commit 0d34660.

torognes avatar Feb 22 '23 17:02 torognes

It took just 7 years...

torognes avatar Feb 22 '23 17:02 torognes

No problem, I am here for the long run. Thanks a lot for that new feature!

frederic-mahe avatar Feb 23 '23 09:02 frederic-mahe

tests added https://github.com/frederic-mahe/vsearch-tests/commit/8b6f8ce07a15e342085da117e8db923f00a6fd49

Without looking at the code, and given the fact that --xlength and --lengthout can be used at the same time, am I right to assume that --xlength acts on the input and --lengthout on the output?

frederic-mahe avatar Feb 23 '23 10:02 frederic-mahe

Yes, --xlength will remove any "length=123" attributes from the input, while --lengthout will add it to the output. This applies to both FASTA and FASTQ files.
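To illustrate the combined behaviour described above (strip on input, annotate on output), here is a rough shell emulation. It is only a sketch under assumptions: `relabel_length` is a hypothetical helper, the attribute is assumed to appear as `;length=N` in the header, only FASTA is handled, and the actual vsearch parsing may differ.

```shell
# Hypothetical emulation (not vsearch itself) of using --xlength together
# with --lengthout on FASTA text: any existing ";length=N" is removed from
# the header, then a fresh ";length=N" computed from the sequence is added.
relabel_length() {
    awk '
        function emit() {
            if (header != "") {
                printf "%s;length=%d\n%s\n", header, length(seq), seq
            }
        }
        /^>/ {
            emit()
            header = $0
            gsub(/;length=[0-9]+/, "", header)   # input side: drop old value
            seq = ""
            next
        }
        { seq = seq $0 }                         # accumulate sequence lines
        END { emit() }                           # output side: add new value
    '
}

# Usage example: a stale length=999 is replaced by the real length
printf '>s1;length=999;size=2\nACGTA\n' | relabel_length
```

Stripping before annotating means a stale or wrong length in the input never ends up duplicated next to the new one.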

I haven't added any documentation yet. Will do.

torognes avatar Feb 23 '23 11:02 torognes

I've updated the manpage to indicate that --xlength acts on input and can be combined with --lengthout (see https://github.com/torognes/vsearch/commit/53b94e527b684072877a153f6475f23c0db973ba)

Finally closing that old issue :-)

frederic-mahe avatar Dec 08 '23 17:12 frederic-mahe

issue fully tested in our test suite (see https://github.com/frederic-mahe/vsearch-tests/commit/8b6f8ce07a15e342085da117e8db923f00a6fd49)

frederic-mahe avatar Dec 08 '23 17:12 frederic-mahe