MGF peak list parser expects field separator to be SPACE

Open andzajan opened this issue 3 years ago • 4 comments

Hi, we had issues reading some MGF files with MetaMorpheus.

I did some debugging, and the problem is that the ParsePeakLine function expects fields to be separated by a single space. In our case the values are TAB-separated.

Matrix Science does not specify whether the peak list should use a single space, a TAB, or multiple whitespace characters: http://www.matrixscience.com/help/data_file_help.html

I have tested this locally, and changing line 169 to something like the example below works with all types of MGF peak lists:

var sArray = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
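For illustration, here is a minimal standalone sketch of why this works: when the separator array passed to String.Split is null, .NET treats whitespace characters as the delimiters, and RemoveEmptyEntries drops the empty fields produced by runs of separators. This is not the actual mzLib code; the PeakLineDemo class, the ParsePeak helper, and the sample lines are hypothetical names and data made up for the example.

```csharp
using System;
using System.Globalization;

public static class PeakLineDemo
{
    // Hypothetical helper for illustration only; not mzLib's ParsePeakLine.
    public static (double Mz, double Intensity) ParsePeak(string line)
    {
        // A null separator array makes Split use whitespace as the delimiter,
        // so a single space, a TAB, and runs of mixed whitespace all work.
        // RemoveEmptyEntries discards the empty strings between adjacent separators.
        string[] fields = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        if (fields.Length < 2)
        {
            throw new FormatException("Expected at least m/z and intensity: '" + line + "'");
        }

        // Parse with the invariant culture so '.' is always the decimal separator.
        double mz = double.Parse(fields[0], CultureInfo.InvariantCulture);
        double intensity = double.Parse(fields[1], CultureInfo.InvariantCulture);

        // Any further fields on the line are ignored in this sketch.
        return (mz, intensity);
    }

    public static void Main()
    {
        // Made-up sample lines: single space, TAB, multiple spaces, mixed whitespace.
        string[] samples = { "846.60 73", "846.60\t73", "846.60   73", " 846.60 \t 73 " };

        foreach (string sample in samples)
        {
            var peak = ParsePeak(sample);
            Console.WriteLine(peak.Mz + " / " + peak.Intensity);
        }
    }
}
```

All four sample lines yield the same pair of values, whereas splitting on a single space character would leave the TAB-separated line as one field.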

I didn't want to open a pull request for such a small change, but I hope you will accept my suggestion. All unit tests in your pipeline pass with this change.

andzajan avatar Feb 22 '22 13:02 andzajan

Can you provide an MGF file to use in our unit test?

trishorts avatar Feb 22 '22 14:02 trishorts

Small example mgf file: https://drive.google.com/file/d/1o1l2PNBtHTKiybYIrTMhXrVkpOvX_Vhn/view?usp=sharing

andzajan avatar Feb 22 '22 14:02 andzajan

I submitted a pull request that should solve the problem. I need two reviews to get approval for the merge. Then I will make a PR to MetaMorpheus with the update. This may take a couple of days. If it's urgent, I may be able to get you a pre-release. https://github.com/smith-chem-wisc/mzLib/pull/611

trishorts avatar Feb 22 '22 14:02 trishorts

Thank you for such a quick turnaround. On our side it's not extremely urgent, as we are using our local build for now.

andzajan avatar Feb 23 '22 08:02 andzajan