PhilosopherFilter fails with `Cannot decode packed binary. expected element type <msms_pipeline_analysis> but have <msms_run_summary>`

Open hguturu opened this issue 2 years ago • 18 comments

Describe the bug

I was running the LFQ-phospho workflow on a set of mzML files from Bruker data. After the rest of the process had run, one of the jobs failed during the PhilosopherFilter step with:

FATA[19:58:57] Cannot decode packed binary. expected element type <msms_pipeline_analysis> but have <msms_run_summary>

This is from the same run as https://github.com/Nesvilab/FragPipe/issues/630 and https://github.com/Nesvilab/FragPipe/issues/631#issuecomment-1086136771, where I had issues with MSBooster and PTMProphet. I actually got the above PhilosopherFilter error previously on Ubuntu, after applying the previously suggested workaround (turn off MSBooster, run the PTMProphet commands manually to sidestep the FragPipe failures). I thought that running things manually might somehow have caused the files to be formatted improperly.

Since then I have switched from Ubuntu to Windows and run everything in FragPipe. I didn't have any issues with MSBooster or PTMProphet, and it ran all the way through. Several PhilosopherFilter jobs also ran, but then this crash occurred.

The error is reproducible: if I run the command below in either PowerShell or Git Bash, I get the same error, and the exit codes ($LASTEXITCODE, $?) are 1 in both.

C:\FragPipe\fragpipe-jre-17.1\tools\philosopher\philosopher.exe filter --sequential --prot 0.01 --tag rev_ --pepxml E:\LFQ-phospho_windows\EXP0252X21_A --protxml E:\LFQ-phospho_windows\combined.prot.xml --razorbin E:\LFQ-phospho_windows\EXP0252X21_A\.meta\razor.bin

If you're submitting a bug report, please attach log file

PhilosopherFilter [Work dir: E:\LFQ-phospho_windows\EXP0252X21_A]
C:\FragPipe\fragpipe-jre-17.1\tools\philosopher\philosopher.exe filter --sequential --prot 0.01 --tag rev_ --pepxml E:\LFQ-phospho_windows\EXP0252X21_A --protxml E:\LFQ-phospho_windows\combined.prot.xml --razorbin E:\LFQ-phospho_windows\EXP0252X21_A\.meta\razor.bin
Process 'PhilosopherFilter' finished, exit code: 1
Process returned non-zero exit code, stopping

~~~~~~~~~~~~~~~~~~~~
Cancelling 3304 remaining tasks
INFO[19:58:56] Done                                         
INFO[19:58:56] Executing Filter  v4.2.1                     
INFO[19:58:57] Fetching razor assignment from: E:\LFQ-phospho_windows\EXP0252X21_A\.meta\razor.bin: 83203 razor groups imported. 
INFO[19:58:57] Processing peptide identification files      
INFO[19:58:57] Parsing interact-EXP0252X21_A.mod.pep.xml 
FATA[19:58:57] Cannot decode packed binary. expected element type <msms_pipeline_analysis> but have <msms_run_summary> 

hguturu avatar Apr 15 '22 16:04 hguturu

Can you send the whole log file?

Thanks,

Fengchao

fcyu avatar Apr 15 '22 16:04 fcyu

Here you go. log_2022-04-14_19-58-57.zip

hguturu avatar Apr 15 '22 17:04 hguturu

Hi Felipe @prvst ,

Can you take a look? Everything seems good until the Philosopher filter command.

Best,

Fengchao

fcyu avatar Apr 15 '22 17:04 fcyu

I also noticed that some runs show the error "'chmod' is not recognized as an internal or external command, operable program or batch file." in PTM-Prophet, but some don't. It might be related to the error in the filter command. Felipe @prvst, can you investigate it too?

Thanks,

Fengchao

fcyu avatar Apr 15 '22 17:04 fcyu

It's a malformed pep.xml file. I see phospho in the folder name, so I guess PTMprophet was executed? If so, reduce the number of threads and try again.

prvst avatar Apr 15 '22 17:04 prvst

I think David has already fixed this bug in PTM-Prophet. Can you update Philosopher to include the new PTM-Prophet? We keep receiving error reports due to this bug.

Best,

Fengchao

fcyu avatar Apr 15 '22 17:04 fcyu

@guoci we added the most recent PTMProphet to the source, right?

prvst avatar Apr 15 '22 17:04 prvst

I have never updated it.

guoci avatar Apr 15 '22 17:04 guoci

Hi, yes, I was running the LFQ-phospho workflow. Is there a recommended maximum number of cores for stable performance? I was using 127 and noticed close to 100% CPU utilization on my 128-core machine.

I assume this means the task is highly parallelized and getting a benefit from the many cores. Since I have many files, I want to run through the files as fast as possible since even with 127 cores this job took nearly a week.

Is there a check I can do on the resulting pep.xml files, so I can just re-run the bad files manually and avoid recomputing everything?

hguturu avatar Apr 15 '22 17:04 hguturu

Unfortunately, no. This is a known problem in a third-party tool, and there's no "right" number. It happens to people running 10 threads as well as 100, because it seems to depend on their configuration.

prvst avatar Apr 15 '22 17:04 prvst

Looks like it might even be non-deterministic in the number of threads (i.e. I think I was able to re-run it using the same thread count and it succeeded).

Oddly proteinprophet uses this file as input, but it didn't complain. Does that mean the file is only partially malformed and proteinprophet results are valid? If not, it might be worthwhile having this check in proteinprophet to avoid running protein inference and then failing at the filter/report stage.

hguturu avatar Apr 17 '22 21:04 hguturu

Yeah, multithreaded issues are hard to debug. The files are good regarding their content. It's just the header that is malformed: some XML tags are left open, and Philosopher needs them to consume the file. ProteinProphet will ignore everything in the file except for the lines it needs.
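
For the earlier question about checking a pep.xml before re-running, here is a minimal sketch in Python. It assumes the damage shows up either as a wrong root element (as in the error message at the top of this thread) or as XML that is not well-formed. The function names `check_pepxml` and `find_bad_pepxml` and the `interact-*.pep.xml` pattern are illustrative and not part of FragPipe or Philosopher, and passing this check does not guarantee that Philosopher will accept the file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def check_pepxml(path, expected_root="msms_pipeline_analysis"):
    """Return (ok, message) for one pep.xml file, streaming it from disk."""
    root_name = None
    try:
        for event, elem in ET.iterparse(str(path), events=("start", "end")):
            if event == "start":
                if root_name is None:
                    # The first "start" event is the document's root element.
                    root_name = local_name(elem.tag)
                    if root_name != expected_root:
                        return False, f"root is <{root_name}>, expected <{expected_root}>"
            else:
                # Drop the contents of finished elements to reduce memory use.
                elem.clear()
    except ET.ParseError as exc:
        # Raised for unclosed tags, truncated files, and other malformed XML.
        return False, f"not well-formed: {exc}"
    return True, "ok"


def find_bad_pepxml(workdir, pattern="interact-*.pep.xml"):
    """Check every matching file under workdir; return {path: message} for failures."""
    bad = {}
    for path in sorted(Path(workdir).rglob(pattern)):
        ok, message = check_pepxml(path)
        if not ok:
            bad[str(path)] = message
    return bad
```

Calling `find_bad_pepxml` on the FragPipe output folder would list the files that fail either test, which are the candidates for re-running PTMProphet on their own.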

prvst avatar Apr 18 '22 14:04 prvst

Hi, I may have a similar problem, though I am not using the LFQ-phospho workflow. I have a large number (>600) of raw files of DDA data (Q Exactive). I am running FragPipe v17.1, MSFragger v3.4, Philosopher v4.1.1, and Python 3.8.8 on a Win10 system with 128 GB RAM and 58 CPUs. The DB search runs fine, but Philosopher reproducibly stops with the following error (slightly different from the one at the top of this thread). I have PTM-Shepherd off; PSM validation and ProteinProphet are on.

PhilosopherFilter [Work dir: D:\Fragger_output\WOW_FT_AllRuns\All_Runs_Fragger\FT_01]
C:\MyPrograms\FragPipe\tools\philosopher\philosopher.exe filter --sequential --mapmods --prot 0.01 --tag rev_ --pepxml D:\Fragger_output\WOW_FT_AllRuns\All_Runs_Fragger\FT_01 --protxml D:\Fragger_output\WOW_FT_AllRuns\All_Runs_Fragger\combined.prot.xml --razor
INFO[21:54:22] Executing Filter v4.1.1
INFO[21:54:22] Processing peptide identification files
Process 'PhilosopherFilter' finished, exit code: 1
FATA[21:54:22] Cannot decode packed binary. strconv.ParseFloat: parsing "0,999993": invalid syntax
Process returned non-zero exit code, stopping
Cancelling 1453 remaining tasks

fragpipe_2022-02-02_17-21-14.config.txt log_2022-04-26_17-46-12.txt

Any clue how to get around this?

ProteomicsPSG avatar Apr 26 '22 15:04 ProteomicsPSG

@ProteomicsPSG that seems to be a different issue. It looks like your computer is using a different standard for encoding float numbers, using commas instead of dots. It's a common problem in computers set to a different language than English, for example.

strconv.ParseFloat: parsing "0,999993": invalid syntax
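
One way to confirm that the comma is really in the file Philosopher was parsing is to search the pep.xml for comma-decimal attribute values. A minimal sketch in Python follows; the function name `find_comma_decimals` and the regular expression are assumptions made for illustration, and the pattern may also match attribute values that were never meant to be numbers.

```python
import re

# Matches attributes such as probability="0,999993" (comma used as decimal mark).
COMMA_DECIMAL = re.compile(r'([\w:.-]+)="(-?\d+,\d+(?:[eE][-+]?\d+)?)"')


def find_comma_decimals(path, limit=10):
    """Return up to `limit` tuples of (line number, attribute name, value)."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            for match in COMMA_DECIMAL.finditer(line):
                hits.append((lineno, match.group(1), match.group(2)))
                if len(hits) >= limit:
                    return hits
    return hits
```

An empty result would suggest the comma is introduced somewhere other than the pep.xml itself; a non-empty one shows which attributes were written with the comma.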

prvst avatar Apr 26 '22 17:04 prvst

As for:

FATA[21:54:22] Cannot decode packed binary. strconv.ParseFloat: parsing "0,999993": invalid syntax

That's weird. My PC runs with English settings in Win10, and I have "." as the decimal separator (not a comma), at least in the Microsoft environment. I am not sure whether that also holds for Python running in the background. Can you advise me how to check and/or correct this? Can I bypass the filtering step? Is this related to one of the parameters that I used?

All my other FragPipe searches worked fine, with no problems. So it seems that I may have an error either in the large data set or in one of the settings.

image

I have now reduced the sample set to 50 runs, with the same settings, and updated Philosopher to v4.2.1. Same error:

Process 'PhilosopherDbAnnotate' finished, exit code: 0
PhilosopherFilter [Work dir: D:\Fragger_output\WOW_FT_AllRuns\Mmx_only\MMx_05]
C:\MyPrograms\FragPipe\tools\philosopher\philosopher.exe filter --sequential --mapmods --prot 0.01 --tag rev_ --pepxml D:\Fragger_output\WOW_FT_AllRuns\Mmx_only\MMx_05 --protxml D:\Fragger_output\WOW_FT_AllRuns\Mmx_only\combined.prot.xml --razor
Process 'PhilosopherFilter' finished, exit code: 1
INFO[13:36:48] Executing Filter v4.2.1
INFO[13:36:48] Processing peptide identification files
INFO[13:36:48] Parsing interact-MMx_05.pep.xml
FATA[13:36:48] Cannot decode packed binary. strconv.ParseFloat: parsing "0,999933": invalid syntax
Process returned non-zero exit code, stopping

ProteomicsPSG avatar Apr 29 '22 09:04 ProteomicsPSG

Can you send me your file?

prvst avatar Apr 29 '22 13:04 prvst

Attached are the params, the config, and the log file, plus a file list for ProteinProphet, all from the reduced sample set (50 runs).

filelist_proteinprophet.txt fragpipe_2022-04-29_12-15-30.config.txt fragger.params.txt log_2022-04-29_13-36-48.txt

ProteomicsPSG avatar Apr 29 '22 17:04 ProteomicsPSG

Hi Felipe, I have uploaded the log file and settings files to this issue thread on GitHub. Do you need more info, e.g. some raw files?

Actually, I think I have done something wrong in the settings parameters, but I don't know where. Could you check that?

Previous searches with MSFragger didn't give this error, but I chose some different settings.

Thanks a lot for your help.

Twan

ProteomicsPSG avatar Oct 11 '22 07:10 ProteomicsPSG

We have pre-release versions of FragPipe and MSFragger that fix the multi-threading issue in PTMProphet.

Please contact us for the pre-release versions if you want to have a try.

Best,

Fengchao

fcyu avatar Oct 27 '22 21:10 fcyu