nextclade run --output-csv is semicolon separated, not comma
--output-csv seems to be producing semicolon-separated (;) output, not comma-separated (,).
The --output-tsv is fine.
% cd dataset-dir
% nextclade run --input-fasta sequences.fasta --dataset-dir . --output-csv out.csv
% head out.csv
seqName;clade;Nextclade_pango;qc.overallScore;qc.overallStatus;totalSubstitutions;totalDeletions;totalInsertions;totalFrameShifts;totalAminoacidSubstitutions;totalAminoacidDeletions;totalAmi <snip>
Thanks for opening this issue.
This is on purpose: https://github.com/nextstrain/nextclade/blob/4f236b58277bdaef72e2f84d26207dbbbc0b8502/packages/web/src/state/algorithm/algorithmExport.sagas.ts#L38
I don't know what the reasoning was, but there is one. @ivan-aksamentov can probably explain.
We should probably document this better, although it's easy to notice by inspection.
@tseemann This is by design. Have you encountered any particular problems with that?
The problem is that it is not documented in the --help and the option is called --output-csv :-)
A note in the --help text next to the option would resolve the confusion.
I think we did this because many fields contain lists, which are themselves comma separated. Quoting the fields ("...") of course solves this, but using a different delimiter seemed more robust. It should definitely be documented, though.
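To illustrate the trade-off described above, here is a minimal sketch using Python's standard csv module. The row values are hypothetical (they are not taken from real Nextclade output), and the helper name write_row is introduced only for this example. It shows that a list-valued field breaks a naive comma join, while either a semicolon delimiter or quoting keeps the row intact.

```python
import csv
import io

# Hypothetical row: the last value is itself a comma-separated list.
row = ["sample1", "21J", "C241T,A23403G,G28881A"]


def write_row(fields, delimiter):
    """Serialize one row with the csv module, quoting only where required."""
    buf = io.StringIO()
    writer = csv.writer(
        buf, delimiter=delimiter, quoting=csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


# Naive comma join without quoting: the list spills into extra columns.
naive = ",".join(row)
naive_fields = naive.split(",")

# Semicolon delimiter: the commas inside the list are ordinary characters.
semicolon = write_row(row, ";")
semicolon_fields = next(csv.reader([semicolon], delimiter=";"))

# Comma delimiter with quoting: the list is wrapped in double quotes.
quoted = write_row(row, ",")
quoted_fields = next(csv.reader([quoted]))

print(len(naive_fields), naive_fields)
print(semicolon, semicolon_fields)
print(quoted, quoted_fields)
```

Both the semicolon and the quoted variants round-trip to the original three fields; only the unquoted comma join loses the column boundaries.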
Has this behaviour been changed in Nextclade 2.x?
@tseemann No, at least not intentionally. What have you found?
P.S. Added a note in https://github.com/nextstrain/nextclade/pull/933
It is still ; separated, despite being called --output-csv :-)
Best to leave it for backward compat but maybe put a note in the --help for it?
( I will just keep using --output-tsv and converting it with csvtk tab2csv )
When you convert from TSV to CSV, you need to surround with quotes any column values that contain a comma; otherwise the file becomes unparseable.
We could have an option to output a comma-separated CSV, but that would require the aforementioned quoting.
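For anyone converting the TSV output themselves, the following is a rough sketch of a conversion that applies the quoting discussed here, using only Python's standard library. The function name tsv_to_csv, the column name exampleList, and the sample values are assumptions made for illustration; this is not how Nextclade or csvtk implement it.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    """Convert tab-separated text to comma-separated text.

    Values that contain a comma are wrapped in double quotes
    (csv.QUOTE_MINIMAL), so list-valued columns stay in one field.
    """
    # QUOTE_NONE on the reader: treat any quote characters in the TSV literally.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    writer = csv.writer(
        out, delimiter=",", quoting=csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical two-line TSV with one list-valued column.
tsv_text = "seqName\tclade\texampleList\nsample1\t21J\tC241T,A23403G\n"
csv_text = tsv_to_csv(tsv_text)
print(csv_text)

# Reading the result back yields the same three columns per row.
parsed = list(csv.reader(io.StringIO(csv_text)))
```

Only the value containing commas gets quoted; plain values are written unchanged.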
Can you elucidate the use case in which ; is unacceptable as separator? Is it to match expectations? Or is there a technical reason?
All decent software to open CSV should allow specification of the separator. After all, German csvs will use ; as separators anyways. We could basically say we produce German CSV ;)
@tseemann --output-csv output has used ; for a long time now, since the very early days of Nextclade 0.x, when someone requested it. In Nextstrain we traditionally use mostly TSV tables everywhere, but someone felt they needed a CSV and requested the feature, so we added it. At the time, due to defects in the early Nextclade implementation, it was considered difficult to implement comma-separated rows, because commas were already used in the values, so we went with the simple solution of using semicolons. The person who requested it was fine with this. And so modern Nextclade had to inherit all that.
Most spreadsheet software (e.g. MS Office, Open Office etc.) and libraries (e.g. Pandas, etc.) should automatically recognize the semicolon delimiter, or at least there is often an option to switch to it, and, in my experience, it is quite widespread in files on the internet. In all places I've seen, it's always called CSV, even if the delimiter is not a comma. No one calls it SSV or anything other than CSV.
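As a small illustration of the "option to switch" mentioned above, here is a sketch of reading semicolon-delimited output with Python's standard csv module by passing the delimiter explicitly. The column names come from the header shown earlier in this thread; the sample rows and values are made up for the example.

```python
import csv
import io

# Hypothetical excerpt of a semicolon-delimited file; values are invented.
text = (
    "seqName;clade;qc.overallScore\n"
    "sample1;21J;0.0\n"
    "sample2;22B;12.5\n"
)

# Tell the reader which delimiter to expect instead of relying on detection.
rows = list(csv.DictReader(io.StringIO(text), delimiter=";"))

# Map each sequence name to its clade.
clades = {r["seqName"]: r["clade"] for r in rows}
print(clades)
```

With pandas, the equivalent is to pass sep=";" to pandas.read_csv. Stating the delimiter explicitly avoids depending on whether a given tool detects it on its own.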
So we consider this normal, and right now there is no intention to change it - this would be an unnecessary breaking change.
I mentioned above that in PR https://github.com/nextstrain/nextclade/pull/933 I added a note to the help message text. It was released in 2.3.0.
Let us know if you encounter any additional problems or have concrete improvement ideas in mind.