Philosopher cannot create report files due to unknown reason. It also needs more than 2 TB memory with 280 GB interact.pep.xml files.

Open Wenhhao opened this issue 3 years ago • 17 comments

Hi FragPipe,

I am trying to create a spectral library with the DIA-Umpire_SpecLib workflow. While running Philosopher Report, a "Cannot write file" error occurred.

Here are some details:

  • Raw data: About 2500 DIA files and 350 DDA files
  • Workflow: DIA_DIA-Umpire_SpecLib_Quant
  • Version: FragPipe 17.1, MSFragger 3.4 and Philosopher 4.1.1

In addition, DIA-Umpire, MSFragger, PeptideProphet, and ProteinProphet were run on one server, but Filter and Report were run on another server because they require 2 TB of memory. I do not think this is the cause of the issue, because similar runs with fewer files have completed normally.

I am sure that the disk space is sufficient, and it does not seem to be a permission issue. Could it be caused by very large .bin files, since psm.bin is as large as 113 GB? If so, how could we deal with it?

Best, Wenhao

Wenhhao avatar May 07 '22 17:05 Wenhhao

@Wenhhao try the latest release of Philosopher, which will reduce the file sizes significantly. https://github.com/Nesvilab/philosopher/tags

guoci avatar May 07 '22 17:05 guoci

Thanks @guoci. So does that mean I could start from Philosopher Report? And would the latest release get rid of this issue? Hundreds of GB is not that big, but dozens of GB is the most I have seen in previous runs.

Wenhhao avatar May 07 '22 17:05 Wenhhao

@Wenhhao You need to rerun all the Philosopher steps because the binary files are no longer compatible. For non-TMT data, the new version should use only about 15% of the storage used by the version of Philosopher you are currently running.

guoci avatar May 07 '22 17:05 guoci
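To put the 15% figure in context, here is a rough back-of-the-envelope sketch. It assumes the quoted fraction applies uniformly to a single file such as the 113 GB psm.bin reported above, which the thread does not confirm; the function name is only illustrative.

```python
def estimated_size_gb(current_gb: float, fraction: float = 0.15) -> float:
    """Scale a current on-disk size by the storage fraction quoted in the thread."""
    return current_gb * fraction


# 113 GB is the psm.bin size reported in the original post.
# Applying the 15% estimate to one file is an assumption, not a measured result.
print(f"Estimated psm.bin size: about {estimated_size_gb(113):.0f} GB")
```

Under that assumption the file would shrink to roughly 17 GB, which is back in the "dozens of GB" range seen in earlier runs.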

Hi @guoci ,

But why would the size be an issue? They have enough space and memory. Do you think with the new Philosopher and smaller bin files, this issue will be gone?

Best,

Fengchao

fcyu avatar May 07 '22 17:05 fcyu

@guoci Do you think the "Cannot write file" error is due to the large binary file size? That is just my guess; I am not sure about it.

Wenhhao avatar May 07 '22 17:05 Wenhhao

I am not sure. Why not try it out? Or do you need to pay for the compute resources?

guoci avatar May 07 '22 17:05 guoci

I can try it, but it will take more than a month. So it would be better if there were a definite solution.

Wenhhao avatar May 07 '22 18:05 Wenhhao

@Wenhhao did you see any .tsv files generated?

guoci avatar May 07 '22 18:05 guoci

@Wenhhao try the latest release of Philosopher, which will reduce the file sizes significantly. https://github.com/Nesvilab/philosopher/tags

Can the latest Philosopher also reduce the resources required? Wenhao mentioned that we need 2 TB of memory on one server, which is not easy for us.

XiaoQiiiii avatar May 08 '22 05:05 XiaoQiiiii

May I ask in which command Philosopher needs 2TB memory?

Thanks,

Fengchao

-- Dr. Fengchao Yu Research Investigator University of Michigan

fcyu avatar May 08 '22 13:05 fcyu

@XiaoQiiiii both the memory and storage will be reduced, and the processing speed will also be improved. @Wenhhao we cannot provide a definite solution. There is no way to prove it will run on your data successfully until you do it.

guoci avatar May 08 '22 13:05 guoci

@Wenhhao did you see any .tsv files generated?

No .tsv files were generated.

Wenhhao avatar May 09 '22 03:05 Wenhhao

May I ask in which command Philosopher needs 2TB memory?

In the Philosopher Filter command. More precisely, it used 1.5-2.0 TB of memory.

Wenhhao avatar May 09 '22 03:05 Wenhhao

@guoci @fcyu Thank you very much. We are going to try again with the latest Philosopher.

Besides, I still have a question about the files generated by DIA-Umpire. To save time, we plan to use Q1 and Q2 and abandon Q3. Do you think that is reasonable?

XiaoQiiiii avatar May 09 '22 06:05 XiaoQiiiii

You can use just Q1. This is what I did with large datasets myself in the last

Alexey

anesvi avatar May 09 '22 10:05 anesvi

@Wenhhao Could you give me an estimate on how much data you're trying to process? Please sum the size in GBs of all your .pepXML files

prvst avatar May 09 '22 14:05 prvst
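For anyone who needs to answer the same question, a minimal sketch for summing pepXML file sizes is shown below. The suffix list and the function name are assumptions made for illustration, and GB is taken as 10^9 bytes; adjust both to your own setup.

```python
from pathlib import Path

# Suffixes assumed for this sketch; adjust them to match your own file naming.
PEPXML_SUFFIXES = (".pepxml", ".pep.xml")


def total_pepxml_size_gb(root: str) -> tuple[int, float]:
    """Return (file count, total size in GB) of pepXML files found under root."""
    count = 0
    total_bytes = 0
    for path in Path(root).rglob("*"):
        # Compare lower-cased names so both .pepXML and .pep.xml are matched.
        if path.is_file() and path.name.lower().endswith(PEPXML_SUFFIXES):
            count += 1
            total_bytes += path.stat().st_size
    return count, total_bytes / 1e9


# Example call (the directory is a placeholder, not a path from this thread):
# count, size_gb = total_pepxml_size_gb("/path/to/results")
```

The walk is recursive, so it can be pointed at the top-level results directory rather than at each experiment folder.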

@Wenhhao Could you give me an estimate on how much data you're trying to process? Please sum the size in GBs of all your .pepXML files

We have more than 15,000 pep.xml files, but they total only 280 GB.

Wenhhao avatar May 10 '22 11:05 Wenhhao

Hi @Wenhhao ,

We are revisiting the memory issue in Philosopher. May I ask whether you have tried the latest Philosopher, and whether the memory requirement was reduced?

Thanks,

Fengchao

fcyu avatar Jan 07 '23 16:01 fcyu

Hi @Wenhhao ,

We are revisiting the memory issue in Philosopher. May I ask whether you have tried the latest Philosopher, and whether the memory requirement was reduced?

Thanks,

Fengchao

Hi @fcyu, we tried Philosopher v4.2.2, and it reduced both the memory requirement and the runtime. We look forward to a release with lower memory requirements, because some larger datasets still need more than 2.0 TB of memory.

Wenhhao avatar Jan 07 '23 16:01 Wenhhao

The latest version is 4.7. Version 4.2 looks quite old, but I am not sure whether the memory requirements have changed since 4.2.

anesvi avatar Jan 07 '23 17:01 anesvi

@anesvi there probably isn't much change to the memory requirements from 4.2 to 4.7, assuming @prvst did not further optimize it since 4.2.

guoci avatar Jan 07 '23 19:01 guoci

The subsequent changes were feature-related, and not related to performance.

prvst avatar Jan 08 '23 15:01 prvst

Hi @fcyu, here are some details about the Philosopher Filter error caused by an incomplete ProteinProphet result, as I mentioned before.

With a larger dataset, Philosopher Filter was interrupted and threw an XML syntax error: [image]

And the end of the prot.xml file was indeed incomplete: [image]

It seems to be an insufficient-memory issue, since the complete result could be created with more memory. In previous versions, however, a similar error occurred while running ProteinProphet itself.

The log file is attached here.

Wenhhao avatar Jan 10 '23 08:01 Wenhhao
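A truncated prot.xml like the one described above can be detected before running Filter. The following is a minimal sketch using Python's standard-library streaming parser; the function name is hypothetical, and it checks only XML well-formedness, not whether the ProteinProphet content is otherwise valid.

```python
import xml.etree.ElementTree as ET


def xml_is_complete(path: str) -> tuple[bool, str]:
    """Stream-parse an XML file and report whether it is well-formed."""
    try:
        for _event, elem in ET.iterparse(path, events=("end",)):
            # Drop the contents of each finished element so that large
            # files do not have to be held in memory all at once.
            elem.clear()
    except ET.ParseError as err:
        # A file cut off mid-write ends with unclosed tags and lands here.
        return False, f"XML syntax error: {err}"
    return True, "well-formed"
```

Running a check like this right after ProteinProphet would surface the problem at the step that produced the file, rather than as a syntax error in the next step.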

It appears that ProteinProphet is failing silently, or the error message is not being captured.

prvst avatar Jan 10 '23 14:01 prvst

Hi @Wenhhao ,

Thank you very much for your updated information.

It looks like ProteinProphet was stopped or crashed in the middle of the task. If ProteinProphet had finished successfully, Philosopher would have printed Process 'ProteinProphet' finished, exit code: 0. This message was not in your log file. It looks like Philosopher failed to capture the error message or return code. It also could not return a non-zero code to let FragPipe know that ProteinProphet crashed.

I think a more critical question is why ProteinProphet crashed. If it is due to insufficient memory, I am not sure how to solve it easily, since ProteinProphet is maintained by another group.

Best,

Fengchao

fcyu avatar Jan 10 '23 15:01 fcyu
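The log check described above can be automated. This is a minimal sketch that assumes every step reports completion in the same format as the message quoted in the comment; the function names are hypothetical, and steps that report differently would not be detected.

```python
import re

# Message format taken from the comment above. Non-zero exit codes are
# assumed to be reported with the same pattern.
FINISHED = re.compile(
    r"Process '(?P<name>[^']+)' finished, exit code: (?P<code>-?\d+)"
)


def exit_codes(log_text: str) -> dict[str, int]:
    """Map each process name found in the log to its last reported exit code."""
    return {m["name"]: int(m["code"]) for m in FINISHED.finditer(log_text)}


def step_succeeded(log_text: str, step: str = "ProteinProphet") -> bool:
    """Return True only if the log explicitly reports exit code 0 for the step."""
    return exit_codes(log_text).get(step) == 0
```

Treating a missing message as a failure matches the situation in this thread, where the step stopped without reporting any exit code at all.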

I also do not know why ProteinProphet would crash. I did notice that they used unreviewed UniProt sequences. I suggest trying a smaller database, such as reviewed UniProt.

anesvi avatar Jan 10 '23 15:01 anesvi