cpg-neo4j - Java heap space error while exporting the CPG for large repos

Open torque59 opened this issue 7 months ago • 4 comments

Issue:

Running cpg-neo4j to create a CPG export against a large codebase such as https://github.com/OpenOLAT/OpenOLAT results in a Java heap space error.

System Information

  • Linux Ubuntu-2404-noble-amd64-base
  • Java Info:
openjdk 21.0.8 2025-07-15
OpenJDK Runtime Environment (build 21.0.8+9-Ubuntu-0ubuntu124.04.1)
OpenJDK 64-Bit Server VM (build 21.0.8+9-Ubuntu-0ubuntu124.04.1, mixed mode, sharing)

Steps to reproduce:

  • Build cpg-neo4j (make sure the appropriate backends are enabled, Java in this case) and clone OpenOLAT.
  • Run the command below to export the CPG.
cpg-neo4j/build/install/cpg-neo4j/bin/cpg-neo4j /home/tester/java-repos/OpenOLAT/src/main/java/ --exclusion-patterns /home/tester/java-repos/OpenOLAT/src/main/java/org/olat/repository/ --no-neo4j --export-json results.json

After a while you should be able to see the following error:

22:05:16,383 INFO  MeasurementHolder TranslationManager: Translation into full graph done in 2452231 ms
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
        at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
        at de.fraunhofer.aisec.cpg_vis_neo4j.Application.call(Application.kt:588)
        at de.fraunhofer.aisec.cpg_vis_neo4j.Application.call(Application.kt:90)
        at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
        at picocli.CommandLine.access$1500(CommandLine.java:148)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
        at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
        at picocli.CommandLine.execute(CommandLine.java:2170)
        at de.fraunhofer.aisec.cpg_vis_neo4j.ApplicationKt.main(Application.kt:629)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.base/java.lang.StringLatin1.toBytes(StringLatin1.java:745)
        at java.base/java.lang.String.valueOf(String.java:4556)
        at com.github.javaparser.LineEndingProcessingProvider.read(LineEndingProcessingProvider.java:123)
        at com.github.javaparser.SimpleCharStream.streamRead(SimpleCharStream.java:37)
        at com.github.javaparser.AbstractCharStream.fillBuff(AbstractCharStream.java:292)
        at com.github.javaparser.AbstractCharStream.readChar(AbstractCharStream.java:388)
        at com.github.javaparser.GeneratedJavaParserTokenManager.jjMoveNfa_0(GeneratedJavaParserTokenManager.java:2466)
        at com.github.javaparser.GeneratedJavaParserTokenManager.jjMoveStringLiteralDfa0_0(GeneratedJavaParserTokenManager.java:451)
        at com.github.javaparser.GeneratedJavaParserTokenManager.getNextToken(GeneratedJavaParserTokenManager.java:3039)
        at com.github.javaparser.GeneratedJavaParser.jj_scan_token(GeneratedJavaParser.java:14340)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_129(GeneratedJavaParser.java:12048)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_239(GeneratedJavaParser.java:11926)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_196(GeneratedJavaParser.java:11899)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_481(GeneratedJavaParser.java:11715)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_476(GeneratedJavaParser.java:11544)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_466(GeneratedJavaParser.java:11415)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_455(GeneratedJavaParser.java:11200)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_433(GeneratedJavaParser.java:11036)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_422(GeneratedJavaParser.java:10518)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_399(GeneratedJavaParser.java:10310)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_374(GeneratedJavaParser.java:10129)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_345(GeneratedJavaParser.java:10047)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_285(GeneratedJavaParser.java:9917)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_230(GeneratedJavaParser.java:9770)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_173(GeneratedJavaParser.java:9696)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_105(GeneratedJavaParser.java:13966)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_312(GeneratedJavaParser.java:9649)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_253(GeneratedJavaParser.java:14133)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_211(GeneratedJavaParser.java:14116)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_293(GeneratedJavaParser.java:13598)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_236(GeneratedJavaParser.java:13586)
        at com.github.javaparser.GeneratedJavaParser.jj_3R_188(GeneratedJavaParser.java:13549)

torque59 avatar Sep 14 '25 20:09 torque59

Hi @torque59,

Thanks for the detailed report. Just a quick question so we can better reproduce your problem: I assume you used the main branch of OpenOLAT? I'm surprised that the project is so big that JavaParser already runs out of memory before the heavy part of the CPG construction even starts. Could you check how much heap space your JVM actually gets assigned?
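
One way to check the assigned heap, as a minimal sketch: the small standalone program below prints the heap limits the JVM reports and the arguments it was started with. The class name `HeapCheck` and the helper `bytesToMiB` are made up for illustration and are not part of cpg. Whether options picked up from an environment variable show up in the argument list may depend on the JVM, so treat that line as a hint rather than proof.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of cpg: prints the heap limits of the running JVM.
public class HeapCheck {

    // Converts a byte count to whole mebibytes (rounds down).
    static long bytesToMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // Upper bound the heap may grow to (normally governed by -Xmx).
        System.out.println("max heap (MiB):       " + bytesToMiB(rt.maxMemory()));
        // Heap currently reserved by the JVM (at startup close to -Xms).
        System.out.println("committed heap (MiB): " + bytesToMiB(rt.totalMemory()));
        // Unused part of the currently reserved heap.
        System.out.println("free heap (MiB):      " + bytesToMiB(rt.freeMemory()));

        // Arguments the JVM itself received (excludes arguments to main).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments:        " + jvmArgs);
    }
}
```

Saved as `HeapCheck.java`, this can be run with `java HeapCheck.java` from the same shell, with the same environment variables, that is used to start cpg-neo4j. It only reports what a JVM launched from that shell receives; any options the cpg-neo4j launcher script adds on its own would not appear here.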

KuechA avatar Sep 15 '25 09:09 KuechA

@KuechA Yes, that is correct, it is on the main branch. One thing I forgot to mention: to allocate sufficient heap space, I ran with export _JAVA_OPTIONS="-Xmx64G -Xms32G -Xss1G" and still ran into the same issue. The machine has 128 GB of RAM.

As for the heap space actually occupied, I will check that on the next run.

torque59 avatar Sep 15 '25 16:09 torque59

Just a quick update: I could reproduce the bug, but I haven't managed to look into it in depth yet. Interestingly, I could analyze the last file in the list of analyzed files as a standalone file, but the neo4j export crashes in this case, even though the file is neither overly complex nor large.

KuechA avatar Sep 17 '25 13:09 KuechA

Hi @torque59 :)

Have you retried and checked how much heap space was actually occupied? Does the problem persist for you, or did you find a solution or workaround?

konradweiss avatar Sep 30 '25 12:09 konradweiss