
split_peptide_index_tempdir is deleted by Windows before MSFragger finishing

Open kguehrs opened this issue 3 years ago • 13 comments

I tried to generate a spectral library from DIA runs on a Thermo Q Exactive HF with DIA-Umpire, using the workflow from the selection box in the workflow menu. I followed the procedure described in the tutorial (https://msfragger.nesvilab.org/tutorial_DIA.html) to set up the analysis. I started with a very large number of runs, and the system crashed after a while with the "java.lang.OutOfMemoryError: Java heap space" error. For the second attempt I drastically reduced the number of files to make sure that this was not the reason for the crash, but the program crashed again with the same error message.

I cannot find any section in the Umpire parameters where I can change the size of the heap space. Is there a place in the parameter files where I can change or insert a suitable parameter definition?

I have attached one of the log files of the failing runs below. log_2021-06-08_16-11-58.txt

kguehrs avatar Jun 09 '21 08:06 kguehrs

FragPipe automatically detects the available free memory if you set RAM to 0: image

You can also give a nonzero value, but if your computer does not have enough memory, increasing the value won't help.

Best,

Fengchao

fcyu avatar Jun 09 '21 13:06 fcyu

Hello Fengchao,

I am not sure whether the problem is insufficient memory, as the error message specifically points to the heap space, which is somewhat specific to Java as far as I understood from reading some posts on Stack Overflow. They mentioned that one can define the size with the JVM parameters -Xmx and/or -XX:MaxPermSize. As I am not familiar with how Java environments are organized, I do not know where to modify this type of setting. In the Umpire parameters I could not find anything related to this and I do not know whether one can insert there an additional command line at some position.

While using MSFragger in the FragPipe package for a simple search, I ran into another problem. When splitDB is used, the program creates a temp directory "split_peptide_index_tempdir". In my case the program crashed because this directory, which was present during the MSFragger analysis of the raw files, was removed by the system and could not be found by the next stages of the program, which were evidently looking for it. I could see this because I followed the data processing in the Run tab on my screen. I find this somewhat mysterious, as the tempdir was apparently removed before the required data were read from it.

Mit freundlichen Grüßen/Best regards

Dr. Karl-Heinz Gührs CF Proteomics

Leibniz Institute on Aging – Fritz Lipmann Institute (FLI) Beutenbergstraße 11 07745 Jena, Germany Phone: +49(0)3641-65-6433 Email: @.*** WWW: http://www.leibniz-fli.de Scientific Director: Prof. Dr. Alfred Nordheim Administrative Director: Dr. Daniele Barthel Chairman of the Board of Trustees: Burkhard Zinner Register of Associations: No. 230296 at Amtsgericht Jena VAT No.: DE 153 925 464


kguehrs avatar Jun 09 '21 17:06 kguehrs

Hi Karl-Heinz,

Yes, Java uses heap space to hold most of its data, which is physically stored in memory. If you see a "not enough heap space" error, it means that either there is not enough memory or too little heap space was given to Java (using -Xmx). FragPipe automatically detects the available free memory and fills in the value for -Xmx. As I mentioned in my last reply, you can change that value in FragPipe.

> In the Umpire parameters I could not find anything related to this and I do not know whether one can insert there an additional command line at some position.

If you search your log for "-Xmx", you will find "-Xmx9G", which is appended by FragPipe.
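For anyone who wants to check this without opening the log in an editor, here is a minimal sketch that pulls every -Xmx flag out of a piece of log text and converts it to bytes. The function name and the sample command line are made up for illustration and are not taken from a real FragPipe log; only the k, m, and g size suffixes are handled.

```python
import re

# Matches JVM maximum-heap flags such as -Xmx9G, -Xmx4096m, or -Xmx2g.
XMX_PATTERN = re.compile(r"-Xmx(\d+)([kKmMgG]?)")

# Multipliers for the unit suffix (no suffix means the number is in bytes).
UNIT_BYTES = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def find_xmx_values(log_text):
    """Return a list of (flag, size_in_bytes) for every -Xmx flag in the text."""
    results = []
    for match in XMX_PATTERN.finditer(log_text):
        number, unit = match.groups()
        size = int(number) * UNIT_BYTES[unit.lower()]
        results.append((match.group(0), size))
    return results


if __name__ == "__main__":
    # Hypothetical command line, NOT copied from a real FragPipe log.
    sample = "java -Xmx9G -jar example.jar params.txt"
    for flag, size in find_xmx_values(sample):
        print(f"{flag}: {size / 1024**3:.1f} GiB")
```

If the value reported this way is close to the machine's total RAM, raising it further in FragPipe is unlikely to help.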

As to the split database error, I will let @guoci take a look.

Best,

Fengchao

fcyu avatar Jun 09 '21 18:06 fcyu

@kguehrs Regarding the splitDB issue, there was another user report on the same issue and I am still investigating it. It looks like your computer is deleting some folders and files while the program is running. My best guess is that some antivirus software (e.g., Windows Defender) is doing the deletion. If you can view the history of deleted files you might see it there. What antivirus software are you running, if any?
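One way to narrow this down is to record when the directory vanishes, so the timestamp can be compared with antivirus reports or system logs. The following is only a sketch under assumptions: it polls the file system, it cannot tell which process did the deletion, and the function name and parameters are hypothetical rather than part of FragPipe or MSFragger.

```python
import os
import time
from datetime import datetime


def watch_directory(path, poll_seconds=1.0, max_checks=None, log=print, sleep=time.sleep):
    """Poll `path` and record each time it appears or disappears.

    Returns a list of (timestamp, event) tuples, where event is "appeared"
    or "disappeared". Stops after `max_checks` polls; with max_checks=None
    it runs until interrupted (for example with Ctrl+C).
    """
    events = []
    existed = os.path.isdir(path)
    checks = 0
    while max_checks is None or checks < max_checks:
        exists_now = os.path.isdir(path)
        if exists_now != existed:
            event = "appeared" if exists_now else "disappeared"
            stamp = datetime.now().isoformat(timespec="seconds")
            events.append((stamp, event))
            log(f"{stamp} {path} {event}")
            existed = exists_now
        checks += 1
        sleep(poll_seconds)
    return events
```

A call such as `watch_directory(r"<output folder>\split_peptide_index_tempdir")`, started in a separate console before the search, would print a line at the moment the directory appears or disappears. The output folder here is a placeholder for wherever the search writes its results.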

guoci avatar Jun 09 '21 18:06 guoci

Hello,

Thanks for the reply. Kaspersky Internet Security is running on my PC. I looked through its history and could not see any file deletion in the reports. FragPipe is placed in a group of programs with somewhat limited rights, but Kaspersky handles R, Python, Perseus, Skyline, and some others the same way. As I mentioned, there was no notice about deleting any program or folder or moving it to a quarantine folder, and the program usually records such events.

I also briefly looked through the Windows event logs and could not find any related error or warning. There are many entries marked as information that I did not search thoroughly. Unfortunately, there is no way to search the event logs directly in Windows; one needs to export them and search in Notepad++ or similar, and I do not have time to do this at the moment.
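If the logs are exported to a plain-text file, the search itself can be scripted. This is a rough sketch that assumes a line-oriented text export; the function name is hypothetical, and the file encoding depends on how the export was made.

```python
def find_matching_lines(lines, needle):
    """Return (line_number, line) pairs for lines that contain `needle`.

    The comparison is case-insensitive and line numbers start at 1.
    """
    needle = needle.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]
```

To use it, open the exported file and pass the file object as `lines` with "split_peptide_index_tempdir" as `needle`; any hits come back with their line numbers so they can be looked up in the export.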


kguehrs avatar Jun 10 '21 07:06 kguehrs

Hello Fengchao,

Thanks for the kind reply and explanation. I will try to repeat the search with a fixed memory setting (not 0 = auto) and even fewer files. I am currently working from home on a computer with 16 i7 cores and 16 GB RAM. My real concern is the database I need to use. I created it by combining entries from an in-house database that has an ENSEMBL structure with UniProt entries for this particular organism. After filtering out the duplicate entries the database is still large (55 MB), and it will grow further when decoys are added. I guess that you keep the database indices in memory, and this might get rather large. Is there any chance to swap this part to HD, which would slow the process but free some memory?


kguehrs avatar Jun 10 '21 08:06 kguehrs

Is the memory issue in MSFragger? DIA-Umpire does not do anything with the database. This must be an MSFragger issue?


anesvi avatar Jun 10 '21 13:06 anesvi

Hi Karl-Heinz

DIA-Umpire processes each run sequentially, so reducing the number of runs does not help. According to your last log, your memory is too small for DIA-Umpire to finish the job. You need to increase your physical memory.

Regarding the database size you mentioned, 55 MB is not a big deal as long as you have enough memory.

> Is there any chance to swap this part to HD, which would slow the process but free some memory?

No, we cannot do that.

Best,

Fengchao

fcyu avatar Jun 10 '21 13:06 fcyu

OK, I see that the memory error is in DIA-Umpire.

I also see you changed the defaults. Given that you only have 16 GB RAM, please try with NoMissedScan = 1 (you changed it to 2 from our default value of 1). Changing it to 2 requires more memory and also makes it slower.

I also see you specified an unusually small mass tolerance, like 5 ppm. I think it is too narrow.

I suggest you run with our defaults first, before starting to change the parameters. Hopefully you would then have enough memory.

Best, Alexey


anesvi avatar Jun 10 '21 14:06 anesvi

So do we have any resolution to Dr. Karl-Heinz Gührs's issue with the files generated by split being deleted?

Do we at least know that it is unrelated to the problem that the UCLA group is having with FragPipe with split on Windows?

anesvi avatar Jun 19 '21 20:06 anesvi

No, I have yet to find the cause; the deletion seems to be unpredictable. It is probably related to the problem the UCLA group is having and only happens on Windows. We have received only 2 such reports. @kguehrs, can you think of anything about your computer (hardware/software) that is not commonly used?

guoci avatar Jun 24 '21 20:06 guoci

Hello Guoci,

I have installed quite a number of different programs on my computer, ranging from R and Python to text-processing software such as OpenOffice and Notepad++. There are some mass-spec-related packages such as MaxQuant, Perseus, and Skyline. I do not think that there is anything very extraordinary.


kguehrs avatar Jun 25 '21 07:06 kguehrs

Hi @kguehrs, may I suggest a reinstallation of Anaconda and see if that fixes the issue?

guoci avatar Jul 17 '21 18:07 guoci