[FEATURE] Disabling JVM Hotspot in modules for JAVA tools

Open lfearnley opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? Please describe

I've encountered a problem with the JVM Hotspot for GATK processes when multiple GATK processes are run on the same node in singularity containers (details in nf-sarek issue #1030). There's also a recent Sarek issue with SIGBUS errors related to Hotspot (nf-sarek issue #1024).

Describe the solution you'd like

I'd like to propose turning the HotSpot performance data (hsperfdata) off using -XX:-UsePerfData in the --java-options passed to GATK.

This has two effects - it should eliminate a class of bugs related to the JVM and hsperfdata, as well as stabilising nf-core Singularity modules in rare and hard-to-debug situations.
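As a rough illustration of the proposal, here is a minimal Python sketch that assembles a GATK command line with the flag appended to `--java-options`. The helper `build_gatk_command` and the file names are hypothetical and are not taken from the nf-core modules; only `gatk --java-options` and `-XX:-UsePerfData` come from the discussion.

```python
import shlex

PERF_FLAG = "-XX:-UsePerfData"


def build_gatk_command(tool, tool_args, java_options=""):
    """Return an argv list for `gatk` with the perf-data flag in --java-options.

    Hypothetical helper for illustration; existing JVM options are kept and
    the flag is only appended if it is not already present.
    """
    opts = java_options.split()
    if PERF_FLAG not in opts:
        opts.append(PERF_FLAG)
    return ["gatk", "--java-options", " ".join(opts), tool, *tool_args]


# Example: print a shell-quoted command line (file names are placeholders).
cmd = build_gatk_command(
    "HaplotypeCaller",
    ["-R", "ref.fasta", "-I", "sample.bam", "-O", "out.vcf.gz"],
    java_options="-Xmx4g",
)
print(shlex.join(cmd))
```

Keeping any existing options (such as a heap size) and appending the flag means a module's memory settings are not overwritten.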

Describe alternatives you've considered

Hotspot is hard-coded in the JVM to write files to /tmp. It ignores the --tmp-dir flag passed to GATK.

As far as I can tell turning this off has no negative side effects beyond preventing the use of jstat and certain Java debuggers which don't seem to be used in nf-core. This detailed blog post from Evan Jones describes an improvement to Java GC efficiency from turning this system off.

Alternatives would include preventing singularity from mounting host /tmp into the container (I'm not certain how this might be achieved within nf-core), or using -XX:+PerfDisableSharedMem.
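To check whether a node is affected, one option is to look for the perf data files themselves. The JVM writes them to `/tmp/hsperfdata_<username>/<pid>`. The sketch below lists them; the function name `find_hsperfdata_files` is made up for this example, and it assumes the default `/tmp` location is visible from where the script runs.

```python
import glob
import os


def find_hsperfdata_files(tmp_root="/tmp"):
    """Return paths of per-JVM perf data files under tmp_root/hsperfdata_<user>/."""
    pattern = os.path.join(tmp_root, "hsperfdata_*", "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


if __name__ == "__main__":
    # Each file is normally named after the PID of a running JVM.
    for path in find_hsperfdata_files():
        print(path)
```

If processes started with -XX:-UsePerfData are the only JVMs on the node, no new files should appear under these directories while they run.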

Additional context

I'm currently trialling nf-sarek with the -XX:-UsePerfData java option on ~100 human WGS and will update on stability.

lfearnley avatar May 24 '23 23:05 lfearnley

Disabling JVM HotSpot works to patch these out for GATK, but the same behaviour can also be triggered by some other Java applications (such as picard commands run in nf-raredisease). -XX:-UsePerfData is stable in my experience across ~200 runs of Sarek.

lfearnley avatar Sep 17 '23 03:09 lfearnley

ok, so picard should be patched as well, I'll do that in a separate PR then...

maxulysse avatar Sep 17 '23 11:09 maxulysse

It may also be an issue for fastqc. It's happening to others so the patches are incredibly useful (https://github.com/nf-core/sarek/issues/1030), but I'm wondering if this is worth tagging with the nextflow devs as it seems to be a common issue.

lfearnley avatar Sep 17 '23 11:09 lfearnley

Changed the name of the issue and kept it open, so that we can track other JAVA tools. All gatk4 modules have been patched (cf #3844), and we have a PR in sarek: https://github.com/nf-core/sarek/pull/1240

maxulysse avatar Sep 18 '23 07:09 maxulysse

Great, thanks!

I'm trying out setting the _JAVA_OPTS environment variable for fastqc, which seems promising so far.

lfearnley avatar Sep 18 '23 07:09 lfearnley

I attempted to set JAVA_TOOLS_OPTIONS and JAVA_OPTS in fgbio processes, but it did not resolve the issue. Fortunately, fgbio accepts -XX:-UsePerfData passed directly.

clbenoit avatar Mar 29 '24 08:03 clbenoit

For completeness, you may need to set '_JAVA_OPTIONS' as well as 'JAVA_TOOL_OPTIONS' (note the spelling: TOOL, not TOOLS) and 'JAVA_OPTS'; https://stackoverflow.com/questions/28327620/difference-between-java-options-java-tool-options-and-java-opts has some more details on this.
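A minimal sketch of the environment-variable route, for tools whose wrappers do not expose a JVM options flag. The HotSpot JVM itself reads `JAVA_TOOL_OPTIONS` and `_JAVA_OPTIONS`; `JAVA_OPTS` is only a convention that some launcher scripts honour, so whether it has any effect depends on the tool. The function name `env_with_perfdata_disabled` is hypothetical.

```python
import os

PERF_FLAG = "-XX:-UsePerfData"
# The first two are read by the JVM; JAVA_OPTS only by some launcher scripts.
JAVA_OPTION_VARS = ("JAVA_TOOL_OPTIONS", "_JAVA_OPTIONS", "JAVA_OPTS")


def env_with_perfdata_disabled(base_env=None):
    """Return a copy of the environment with the flag added to each variable."""
    env = dict(os.environ if base_env is None else base_env)
    for name in JAVA_OPTION_VARS:
        current = env.get(name, "").split()
        if PERF_FLAG not in current:
            current.append(PERF_FLAG)
        env[name] = " ".join(current)
    return env


# Possible usage (not executed here; the tool and file name are placeholders):
#   import subprocess
#   subprocess.run(["fastqc", "sample.fastq.gz"], env=env_with_perfdata_disabled())
```

Returning a copy rather than modifying `os.environ` keeps the change scoped to the child process that is launched with it.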

lfearnley avatar Mar 29 '24 09:03 lfearnley

@lfearnley Is this still an open issue? Or do we need to add this to the documentation somewhere (looking at @mashehu for that if that's the case)?

famosab avatar Mar 13 '25 10:03 famosab