Is there a way to disable output of alignments/gene extraction?
Hi,
Thanks for the useful tool. I'd like to use nextclade_cli in a pipeline purely to get the *.auspice.json and *.tsv (clade assignment, QC etc.) files. I don't need the other output files, including the alignment and the genes extracted into individual files. Is there a way to stop these files from being output, or do I just need to delete them post hoc?
I read through the online help and the help for the version I'm currently using (v1.4.1 via Docker), but couldn't find an option for this.
Cheers, Charles
Hi @charlesfoster,
No, there is currently no such option.
Historically, these flags were simply inherited from Nextalign, whose users always want the alignment (it's the only thing Nextalign does). The peptide outputs were added to the mix later on.
@corneliusroemer was also thinking about that. I wonder how we can introduce the ability to skip the sequence and/or peptide outputs without breaking the current interface and without making it confusing. I imagine there are also people who want aligned sequences and not peptides, and vice versa.
One thing to note is that even if the files are not emitted, the computational steps that produce the alignment and peptides still have to be performed so that the other algorithm steps can run. These options would therefore be purely cosmetic, although they might help users save some disk space.
What do you imagine these options would look like in the command-line interface?
Thanks for the fast response. I had wondered about whether the alignment, peptides etc. were needed for the final QC results.
For now, I can add a rule to my workflow to collect and remove the alignments etc. - no problem. Moving forward, if there were to be an added option to the cli, I suppose there might be flags to stop the creation/saving of output fasta files for the alignment and the extracted genes. Something like:
--no-output-alignment Disable output of aligned input sequences
--no-extracted-genes Disable output of extracted protein-coding genes from input sequences
--peptides-only Only output translations of extracted protein-coding genes (no nucleotide sequences)
--nucleotides-only Only output nucleotide sequences of extracted protein-coding genes (no peptide sequences)
Obviously some of these would be mutually exclusive, and the wording might need to be finessed. Not sure how they would work, either. Intermediate files sent to /tmp or /dev/null? Or written to disk then removed?
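In the meantime, the post-hoc cleanup could be sketched roughly as below. This is only an illustration: the function name `remove_unwanted_outputs` is hypothetical, and the keep patterns assume the wanted files end in `.tsv` and `.auspice.json`. It would need adjusting to the actual file names produced by the Nextclade version in use.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumption: only files whose names match these patterns should be kept.
KEEP_PATTERNS = ("*.tsv", "*.auspice.json")


def remove_unwanted_outputs(output_dir, keep_patterns=KEEP_PATTERNS):
    """Delete every regular file in output_dir whose name matches none of keep_patterns.

    Returns the names of the deleted files, in sorted order.
    """
    removed = []
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue  # leave subdirectories untouched
        if any(fnmatch(path.name, pattern) for pattern in keep_patterns):
            continue  # this is one of the files we want to keep
        path.unlink()
        removed.append(path.name)
    return removed
```

A workflow rule could call this on the output directory once the Nextclade step has finished, so only the clade/QC table and the tree JSON remain.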
Cheers, Charles
v2 can do this.