ballerina-lang
[Bug]: OOM when building health.hl7v2commons with bal9
Description
https://github.com/ballerina-platform/module-ballerinax-health.hl7v2/tree/main/commons - update the Ballerina version and remove the explicit dependencies from the Ballerina.toml.
bal build then fails with the following error:
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3537)
at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:100)
at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:130)
at java.base/java.io.DataOutputStream.write(DataOutputStream.java:112)
at org.wso2.ballerinalang.compiler.bir.writer.BIRBinaryWriter.serialize(BIRBinaryWriter.java:90)
at io.ballerina.projects.ModuleContext.generateBIR(ModuleContext.java:523)
at io.ballerina.projects.ModuleContext.generateCodeInternal(ModuleContext.java:472)
at io.ballerina.projects.ModuleCompilationState$4.generatePlatformSpecificCode(ModuleCompilationState.java:132)
at io.ballerina.projects.ModuleContext.generatePlatformSpecificCode(ModuleContext.java:387)
at io.ballerina.projects.JBallerinaBackend.performCodeGen(JBallerinaBackend.java:173)
at io.ballerina.projects.JBallerinaBackend.<init>(JBallerinaBackend.java:142)
at io.ballerina.projects.JBallerinaBackend.lambda$from$0(JBallerinaBackend.java:126)
at io.ballerina.projects.JBallerinaBackend$$Lambda$405/0x00000008003d57c8.apply(Unknown Source)
at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1220)
at io.ballerina.projects.PackageCompilation.getCompilerBackend(PackageCompilation.java:179)
at io.ballerina.projects.JBallerinaBackend.from(JBallerinaBackend.java:125)
at io.ballerina.projects.JBallerinaBackend.from(JBallerinaBackend.java:113)
at io.ballerina.cli.task.CompileTask.execute(CompileTask.java:216)
at io.ballerina.cli.TaskExecutor.executeTasks(TaskExecutor.java:40)
at io.ballerina.cli.cmd.BuildCommand.execute(BuildCommand.java:300)
at io.ballerina.cli.launcher.Main$$Lambda$47/0x0000000800131210.accept(Unknown Source)
at java.base/java.util.Optional.ifPresent(Optional.java:178)
at io.ballerina.cli.launcher.Main.main(Main.java:58)
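For context on the top frames: the failing allocation happens while BIRBinaryWriter.serialize writes into an in-memory ByteArrayOutputStream. When such a buffer is full, it allocates a larger array and copies the old contents into it (the Arrays.copyOf frame), so the old and new arrays are briefly live at the same time. The sketch below is not compiler code; GrowthSketch and ObservableBuffer are hypothetical names used only to make the reallocations visible.

```java
import java.io.ByteArrayOutputStream;

// Illustration only: shows that an in-memory buffer grows by reallocating its backing array.
final class GrowthSketch {

    /** Exposes the protected backing array of ByteArrayOutputStream so growth can be observed. */
    static final class ObservableBuffer extends ByteArrayOutputStream {
        int reallocations = 0;

        @Override
        public synchronized void write(byte[] b, int off, int len) {
            int before = buf.length;
            super.write(b, off, len);
            // A changed array length means a new, larger array was allocated and filled by copy.
            if (buf.length != before) {
                reallocations++;
            }
        }

        int capacity() {
            return buf.length;
        }
    }

    /** Writes `chunks` blocks of `chunkSize` zero bytes into a fresh buffer. */
    static ObservableBuffer fill(int chunks, int chunkSize) {
        ObservableBuffer out = new ObservableBuffer();
        byte[] chunk = new byte[chunkSize];
        for (int i = 0; i < chunks; i++) {
            out.write(chunk, 0, chunk.length);
        }
        return out;
    }

    public static void main(String[] args) {
        ObservableBuffer out = fill(1024, 1024);
        System.out.println("bytes=" + out.size()
                + " capacity=" + out.capacity()
                + " reallocations=" + out.reallocations);
    }
}
```

The larger the serialized output, the larger the single contiguous array each growth step has to find room for, which is consistent with the OOM surfacing in this frame.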
Eclipse Memory Analyzer shows 2 problems:
The thread java.lang.Thread @ 0x7c0749a70 main keeps local variables with total size 136,159,096 (12.67%) bytes.
The memory is accumulated in one instance of java.lang.Thread, loaded by <system class loader>, which occupies 136,159,096 (12.67%) bytes.
main
at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
at java.util.Arrays.copyOf([BI)[B (Arrays.java:3537)
at java.io.ByteArrayOutputStream.ensureCapacity(I)V (ByteArrayOutputStream.java:100)
at java.io.ByteArrayOutputStream.write([BII)V (ByteArrayOutputStream.java:130)
at java.io.DataOutputStream.write([BII)V (DataOutputStream.java:112)
at org.wso2.ballerinalang.compiler.bir.writer.BIRBinaryWriter.serialize()[B (BIRBinaryWriter.java:90)
at io.ballerina.projects.ModuleContext.generateBIR(Lio/ballerina/projects/ModuleContext;Lorg/wso2/ballerinalang/compiler/util/CompilerContext;)Ljava/io/ByteArrayOutputStream; (ModuleContext.java:523)
at io.ballerina.projects.ModuleContext.generateCodeInternal(Lio/ballerina/projects/ModuleContext;Lio/ballerina/projects/CompilerBackend;Lorg/wso2/ballerinalang/compiler/util/CompilerContext;)V (ModuleContext.java:472)
at io.ballerina.projects.ModuleCompilationState$4.generatePlatformSpecificCode(Lio/ballerina/projects/ModuleContext;Lorg/wso2/ballerinalang/compiler/util/CompilerContext;Lio/ballerina/projects/CompilerBackend;)V (ModuleCompilationState.java:132)
at io.ballerina.projects.ModuleContext.generatePlatformSpecificCode(Lorg/wso2/ballerinalang/compiler/util/CompilerContext;Lio/ballerina/projects/CompilerBackend;)V (ModuleContext.java:387)
at io.ballerina.projects.JBallerinaBackend.performCodeGen(Z)V (JBallerinaBackend.java:173)
at io.ballerina.projects.JBallerinaBackend.<init>(Lio/ballerina/projects/PackageCompilation;Lio/ballerina/projects/JvmTarget;Z)V (JBallerinaBackend.java:142)
at io.ballerina.projects.JBallerinaBackend.lambda$from$0(Lio/ballerina/projects/PackageCompilation;Lio/ballerina/projects/JvmTarget;ZLio/ballerina/projects/CompilerBackend$TargetPlatform;)Lio/ballerina/projects/JBallerinaBackend; (JBallerinaBackend.java:126)
at io.ballerina.projects.JBallerinaBackend$$Lambda$425+0x0000000800427a70.apply(Ljava/lang/Object;)Ljava/lang/Object; ()
at java.util.HashMap.computeIfAbsent(Ljava/lang/Object;Ljava/util/function/Function;)Ljava/lang/Object; (HashMap.java:1220)
at io.ballerina.projects.PackageCompilation.getCompilerBackend(Lio/ballerina/projects/CompilerBackend$TargetPlatform;Ljava/util/function/Function;)Lio/ballerina/projects/CompilerBackend; (PackageCompilation.java:179)
at io.ballerina.projects.JBallerinaBackend.from(Lio/ballerina/projects/PackageCompilation;Lio/ballerina/projects/JvmTarget;Z)Lio/ballerina/projects/JBallerinaBackend; (JBallerinaBackend.java:125)
at io.ballerina.projects.JBallerinaBackend.from(Lio/ballerina/projects/PackageCompilation;Lio/ballerina/projects/JvmTarget;)Lio/ballerina/projects/JBallerinaBackend; (JBallerinaBackend.java:113)
at io.ballerina.cli.task.CompileTask.execute(Lio/ballerina/projects/Project;)V (CompileTask.java:216)
at io.ballerina.cli.TaskExecutor.executeTasks(Lio/ballerina/projects/Project;)V (TaskExecutor.java:40)
at io.ballerina.cli.cmd.TestCommand.execute()V (TestCommand.java:377)
at io.ballerina.cli.launcher.Main$$Lambda$47+0x0000000800131210.accept(Ljava/lang/Object;)V ()
at java.util.Optional.ifPresent(Ljava/util/function/Consumer;)V (Optional.java:178)
at io.ballerina.cli.launcher.Main.main([Ljava/lang/String;)V (Main.java:58)
947,983 instances of org.wso2.ballerinalang.compiler.diagnostic.BLangDiagnosticLocation, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0x7c07390d0 occupy 113,757,984 (10.59%) bytes.
Affected Version(s)
2201.9.0
Related area
-> Compilation
Can we get an update on this? This issue is a blocker for healthcare use cases when using Bal 9.
@sameeragunarathne we are discussing the resolution for this. Will provide an update as soon as possible.
The root cause of this issue is that, when compiling multiple modules, we keep the bLangPackage of each compiled module in memory. As the number of modules increases, this causes an out-of-memory (OOM) error. We retain the bLangPackage of a compiled module because it is needed for the test command. As a solution for now, we are going to remove some unnecessary closures that are generated for module-level annotations, to reduce the size of the bLangPackage.
To add to the findings of @chiranSachintha, we clean the bLangPackages of the dependencies after the code generation. We cannot extend this to clean the modules of the user's package since in some commands like bal doc and bal test, we access the bLangPackage after the code generation phase. We could do extensive refactoring to improve this to be able to clean the bLangPackage instances for the user's package also, but IMO, we could only put off the OOM with it. The issue will surface again when the package keeps growing.
If the fix from @chiranSachintha resolves it for now, we can go ahead and do a patch release. At the same time, we would need to work on a proper optimization.
When compiling with a clean central cache, we are experiencing an OOM issue with the Ballerina compiler. This issue doesn’t occur during subsequent compilations. The root cause is that, in the initial compilation, all direct and indirect dependencies of the current package are compiled, creating BIR and Jar files for each dependency package in the cache.
Currently, the compiler driver compiles all these dependencies within the same process, which likely leads to unnecessary memory usage.
From the second compilation onwards, the compiler driver reads the BIR of package dependencies instead of recompiling them.
To address this, what if we make the first compilation behave like the subsequent ones, where it reads the BIR of package dependencies instead of compiling them from scratch? We can achieve this by creating a new OS process to compile each package dependency. Once the process finishes, the compiler driver can read the BIR of that package. This approach should technically resolve the issue.
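The proposal above can be sketched roughly as follows. This is a minimal illustration and not the compiler driver's implementation: the class name, the child entry point `example.CompileOneDependency`, the cache directory argument, and the per-child `-Xmx` limit are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch only: compile each dependency in its own JVM so its heap is released on exit.
final class DependencyProcessSketch {

    // Hypothetical entry point that would compile one package and write its BIR to the cache.
    static final String CHILD_MAIN = "example.CompileOneDependency";

    /** Builds the command line for a child JVM that compiles a single dependency. */
    static List<String> childCommand(String dependency, int maxHeapMb, Path cacheDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add(Path.of(System.getProperty("java.home"), "bin", "java").toString());
        cmd.add("-Xmx" + maxHeapMb + "m"); // assumed per-child heap cap
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(CHILD_MAIN);
        cmd.add(dependency);
        cmd.add(cacheDir.toString());
        return cmd;
    }

    /** Starts the child JVM and waits for it; returns the child's exit code. */
    static int compileInChildProcess(String dependency, int maxHeapMb, Path cacheDir)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(childCommand(dependency, maxHeapMb, cacheDir))
                .inheritIO()
                .start();
        return child.waitFor();
    }

    /** Compiles dependencies one at a time, in the order given. */
    static void compileAll(List<String> dependenciesInOrder, Path cacheDir)
            throws IOException, InterruptedException {
        for (String dependency : dependenciesInOrder) {
            int exitCode = compileInChildProcess(dependency, 1024, cacheDir);
            if (exitCode != 0) {
                throw new IllegalStateException("failed to compile " + dependency);
            }
            // At this point the driver would read the dependency's BIR from the cache,
            // as it already does from the second compilation onwards.
        }
    }

    public static void main(String[] args) {
        // Only prints the command; nothing is spawned here.
        System.out.println(childCommand("ballerinax/health.hl7v23:3.0.2", 1024, Path.of("cache")));
    }
}
```

Running one child at a time keeps peak memory close to the driver plus the single largest dependency, at the cost of JVM start-up time for every dependency.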
We improved the memory consumption with https://github.com/ballerina-platform/ballerina-lang/pull/43009. Now ballerinax/health.hl7v2commons package compiles and generates the executable successfully. The first compilation takes about 6 minutes (all healthcare dependencies are compiled from the sources). The subsequent compilations take ~7 seconds.
However, some of the dependencies consume close to 1GB of memory even when compiled in a separate process.
Below is a summary of the memory consumption after the fix. If the packages keep growing, then we might experience OOM again soon.
Memory consumed by the main process ~= 250 MB
| Healthcare dependency | Memory consumption approx. (x) | Total memory consumed by bal build (x + 250MB) |
|---|---|---|
| ballerinax/health.hl7v2:2.2.1 | 73 MB | 323 MB |
| ballerinax/health.hl7v231:3.0.1 | 550 MB | 800 MB |
| ballerinax/health.hl7v23:3.0.2 | 500 MB | 750 MB |
| ballerinax/health.hl7v24:3.0.1 | 550 MB | 800 MB |
| ballerinax/health.hl7v251:3.0.1 | 750 MB | 1000 MB |
| ballerinax/health.hl7v25:3.0.1 | 800 MB | 1050 MB |
| ballerinax/health.hl7v26:3.0.1 | 850 MB | 1100 MB |
| ballerinax/health.hl7v27:3.0.1 | 990 MB | 1240 MB |
| ballerinax/health.hl7v28:3.0.1 | 980 MB | 1230 MB |
My recommendation is to expose this feature with a compiler option, as discussed. Let's mark it as experimental initially.
--optimize-dependency-compilation
[EXPERIMENTAL] Enables memory-efficient compilation of package dependencies using separate processes. This can help prevent out-of-memory issues during initial compilation with a clean central cache.
@nirmal070125 @sameeragunarathne This improvement will be released with 2201.9.2 which is estimated to be released this week. As discussed in https://github.com/ballerina-platform/ballerina-lang/issues/42860#issuecomment-2190795584, this optimization in the compilation is not enabled by default. This can be enabled by passing the --optimize-dependency-compilation flag to the build or adding the build option in the Ballerina.toml file as shown below.
[build-options]
optimizeDependencyCompilation = true
@sameeragunarathne and I tested this by publishing the BallerinaX modules to the local cache and then trying to build the commons module using the locally published modules. However, I am encountering an error: failed to compile ballerinax/health.hl7v23:3.0.4. When I try to access the BallerinaX modules through the central repository, it works fine. @azinneera
Created a separate issue to analyze and optimize compile-time memory usage when compiling each module: https://github.com/ballerina-platform/ballerina-lang/issues/43125
The failure when building against locally published modules is fixed with https://github.com/ballerina-platform/ballerina-lang/pull/43109. We will release it with the U9.3 release.
The fixes below were added in U10 to improve compile-time memory usage when compiling a module. They reduce heap usage by ~60MB.
- Remove creating HashMap for jarEntries in JvmPackageGen https://github.com/ballerina-platform/ballerina-lang/issues/43222
- Reduce BLangDiagnosticLocation to create LineRange and TextRange only when needed https://github.com/ballerina-platform/ballerina-lang/issues/43223
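The idea behind creating range objects only when needed can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not the compiler's actual BLangDiagnosticLocation, LineRange, or TextRange; the `rangesCreated` counter exists only to make the allocations observable.

```java
// Sketch only: store compact primitives per location and build range objects on demand.
final class LazyLocationSketch {

    // Counts how many LineRange objects have been allocated (for demonstration).
    static int rangesCreated = 0;

    /** Hypothetical stand-in for a line range object. */
    static final class LineRange {
        final int startLine, startCol, endLine, endCol;

        LineRange(int startLine, int startCol, int endLine, int endCol) {
            this.startLine = startLine;
            this.startCol = startCol;
            this.endLine = endLine;
            this.endCol = endCol;
            rangesCreated++;
        }
    }

    /** Hypothetical location that keeps only ints until a caller asks for the range. */
    static final class Location {
        private final int startLine, startCol, endLine, endCol;

        Location(int startLine, int startCol, int endLine, int endCol) {
            this.startLine = startLine;
            this.startCol = startCol;
            this.endLine = endLine;
            this.endCol = endCol;
        }

        /** Built on each call and not stored, so unused locations never pay for it. */
        LineRange lineRange() {
            return new LineRange(startLine, startCol, endLine, endCol);
        }
    }

    public static void main(String[] args) {
        Location[] locations = new Location[100_000];
        for (int i = 0; i < locations.length; i++) {
            locations[i] = new Location(i, 0, i, 10);
        }
        System.out.println("ranges after construction: " + rangesCreated);
        LineRange range = locations[42].lineRange();
        System.out.println("ranges after one lookup: " + rangesCreated
                + " (line " + range.startLine + ")");
    }
}
```

With roughly 950,000 location instances in the heap dump from the description, avoiding eagerly created helper objects per location adds up; the trade-off is a fresh allocation on every lookup.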
The fix below, which will reduce heap usage by ~110MB, is yet to be merged to the master branch.
- Clean syntax tree before codegen https://github.com/ballerina-platform/ballerina-lang/issues/43226
@rdulmina shall we add the improvement for each of the dependencies in the memory consumption table above, to get a clear idea of the improvement?