
Reports: adding multiple proteins to comma separated file

Open tivdnbos opened this issue 4 years ago • 3 comments

When multiple proteins are added to a report, a comma is used to separate them. This happens even when the user chooses the comma-separated (csv) format instead of tab, so the column content collides with the column separator. It is not a major problem, since the user can use the default tab-separated format instead, but I thought you should be aware of it. The same problem might also occur for other features, but I haven't tested this.

tivdnbos avatar Jan 31 '21 10:01 tivdnbos

Good point! Any suggestions for what we ought to use instead of a comma, given that we still want the text to be easy to read?

hbarsnes avatar Jan 31 '21 20:01 hbarsnes

Sorry for my late reply, Harald.

I think the most convenient way is to not provide csv files at all, only the default tsv files, so that no parsing errors can occur. Alternatively, a semicolon might be of use if we want to keep the csv format? I don't think semicolons occur often in protein names.

Best, Tim

tivdnbos avatar Feb 15 '21 16:02 tivdnbos

Hi Tim,

Now that you mention it, aren't all our default text exports tab based, i.e. tsv files and not csv files? That would mean this issue only occurs if a user chooses comma-separated columns and also selects export content, such as protein groups, where a comma is used to separate the items within a column. So the easy fix is simply to not create those types of reports I guess? ;)

I'm leaning against removing csv as an output format though, as there might be simpler user-defined exports that do not include comma-separated column content.

And from a general point of view, there is really no symbol that can be considered "safe" when it comes to protein names and accession numbers, as I do not think we can guarantee that no protein name or accession number will ever contain a comma or a semicolon, for example.

Best regards, Harald

hbarsnes avatar Feb 16 '21 14:02 hbarsnes
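
For readers following along, here is a minimal sketch of the collision discussed above, together with one option that was not raised in the thread: quoting fields that contain the delimiter, as is conventional for csv. It uses Python's standard `csv` module purely for illustration and is not how PeptideShaker writes its reports. The row contents and the helper names `write_row` and `read_row` are hypothetical.

```python
import csv
import io


def write_row(fields, delimiter=","):
    """Serialize one row, quoting only the fields that contain the delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buffer.getvalue()


def read_row(line, delimiter=","):
    """Parse one serialized row back into its fields."""
    return next(csv.reader(io.StringIO(line), delimiter=delimiter))


# Hypothetical report row: the second column holds a protein group whose
# accessions are joined with ", ", which is the situation described in the issue.
row = ["1", "P12345, Q67890", "42.0"]

# Naive joining: the comma inside the protein group looks like a column break.
naive = ",".join(row)
print(naive.split(","))  # four fields instead of three

# Quoted writing: the protein group survives a round trip as a single field.
quoted = write_row(row)
print(read_row(quoted))
```

With quoting, the comma can stay as the readable in-column separator, at the cost of requiring downstream tools to parse the file with a csv-aware reader rather than a plain split on commas.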