Generating upcalls fills up the Java CodeHeap and can lead to an OutOfMemoryException
This one is a bit more invasive, and I'm a bit out of my depth for fixing it myself.
It is more easily reproduced with smaller code heap sizes, using options like -XX:NonProfiledCodeHeapSize=10M -XX:ProfiledCodeHeapSize=10M -XX:NonNMethodCodeHeapSize=8M
The individual values as they change can be viewed in the Memory tab of a JMX session like JDK Mission Control (jmc). When all 3 fill up (CodeHeap 'profiled nmethods', CodeHeap 'non-nmethods', and CodeHeap 'non-profiled nmethods'), an OutOfMemoryException will be thrown.
[5323.954s][warning][codecache] CodeHeap 'non-nmethods' is full. Compiler has been disabled.
[5323.954s][warning][codecache] Try increasing the code heap size using -XX:NonNMethodCodeHeapSize=
OpenJDK 64-Bit Server VM warning: CodeHeap 'non-nmethods' is full. Compiler has been disabled.
OpenJDK 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonNMethodCodeHeapSize=
CodeHeap 'non-profiled nmethods': size=119172Kb used=119171Kb max_used=119171Kb free=0Kb
bounds [0x00007cfcf7f9f000, 0x00007cfcff400000, 0x00007cfcff400000]
CodeHeap 'profiled nmethods': size=119164Kb used=119163Kb max_used=119163Kb free=0Kb
bounds [0x00007cfcf0400000, 0x00007cfcf785f000, 0x00007cfcf785f000]
CodeHeap 'non-nmethods': size=7424Kb used=7423Kb max_used=7423Kb free=0Kb
bounds [0x00007cfcf785f000, 0x00007cfcf7f9f000, 0x00007cfcf7f9f000]
CodeCache: size=245760Kb, used=245757Kb, max_used=245757Kb, free=0Kb
total_blobs=322192, nmethods=2008, adapters=1272, full_count=1
Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0
[5324.272s][warning][codecache] CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
[5324.272s][warning][codecache] Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
OpenJDK 64-Bit Server VM warning: CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
OpenJDK 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
CodeHeap 'non-profiled nmethods': size=119172Kb used=119171Kb max_used=119171Kb free=0Kb
bounds [0x00007cfcf7f9f000, 0x00007cfcff400000, 0x00007cfcff400000]
CodeHeap 'profiled nmethods': size=119164Kb used=119163Kb max_used=119163Kb free=0Kb
bounds [0x00007cfcf0400000, 0x00007cfcf785f000, 0x00007cfcf785f000]
CodeHeap 'non-nmethods': size=7424Kb used=7423Kb max_used=7423Kb free=0Kb
bounds [0x00007cfcf785f000, 0x00007cfcf7f9f000, 0x00007cfcf7f9f000]
CodeCache: size=245760Kb, used=245757Kb, max_used=245757Kb, free=0Kb
total_blobs=322193, nmethods=2009, adapters=1272, full_count=570
Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0
java.lang.AssertionError: java.lang.InternalError: java.lang.NoSuchMethodException: no such method: java.lang.invoke.MethodHandle.linkToSpecial(Object,long,long,int,long,long,MemberName)void/invokeStatic
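The per-heap values mentioned above can also be read from inside the process, without attaching jmc. This is a minimal sketch using the standard `java.lang.management` API; the class name `CodeHeapUsage` is made up for illustration, and the pool-name filter assumes HotSpot's naming ("CodeHeap '...'" with a segmented code cache, "CodeCache" otherwise).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Java-GI.
public class CodeHeapUsage {

    /** Returns the used bytes of every code cache memory pool, keyed by pool name. */
    public static Map<String, Long> snapshot() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot pool names. Other JVMs may name these pools differently.
            if (name.startsWith("CodeHeap") || name.equals("CodeCache")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) { // getUsage() returns null for an invalid pool
                    used.put(name, usage.getUsed());
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, bytes) ->
                System.out.printf("%s: %d KiB used%n", name, bytes / 1024));
    }
}
```

Calling `snapshot()` periodically (for example from a tick callback) and logging the result should show which of the three heaps grows first.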
Classes extending io.github.jwharm.javagi.base.FunctionPointer, like org.gnome.glib.SourceFunc, require memory in the CodeHeap whenever they create a new upcall stub. Instead of creating a new upcall stub for each call to functions like GLib.idleAdd, a generic static upcall stub can be created once, and the MemorySegment passed to the upcall can be used to look up the actual SourceFunc, similar to how Arenas.close_cb works.
I've tested manually replacing calls to GLib.idleAdd with the following, and it avoids the OutOfMemoryException:
private static final Map<Integer, SourceFunc> FUNCS = new HashMap<>();
/**
* The upcall stub for the timeoutAddSeconds callback method
*/
public static final MemorySegment NEXT_SOURCE_FUNC;
// Allocate the upcall stub for the SourceFunc callback method
static {
try {
FunctionDescriptor _fdesc = FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.ADDRESS);
MethodHandle _handle = MethodHandles.lookup().findStatic(SlideShowPlaylistSprite.class, "upcall",
_fdesc.toMethodType());
NEXT_SOURCE_FUNC = Linker.nativeLinker().upcallStub(_handle, _fdesc, Arena.global());
} catch (NoSuchMethodException | IllegalAccessException e) {
throw new RuntimeException(e);
}
}
/**
* The {@code upcall} method is called from native code. The parameters
* are marshaled and {@link #run} is executed.
*/
public static int upcall(MemorySegment userData) {
int hashCode = userData.reinterpret(ValueLayout.JAVA_INT.byteSize()).get(ValueLayout.JAVA_INT, 0);
SourceFunc func = FUNCS.get(hashCode);
if (func != null) {
var _result = func.run();
if (!_result) {
FUNCS.remove(hashCode);
}
return _result ? 1 : 0;
}
return 0;
}
@FunctionalInterface
@Generated("io.github.jwharm.JavaGI")
public static interface SourceFunc {
/**
* Specifies the type of function passed to {@link GLib#timeoutAdd},
* {@code GLib#timeoutAddFull}, {@link GLib#idleAdd}, and
* {@code GLib#idleAddFull}.
* <p>
* When calling {@link Source#setCallback}, you may need to cast a
* function of a different type to this type. Use {@code GLib#SOURCEFUNC} to
* avoid warnings about incompatible function types.
*/
boolean run();
}
static final MethodHandle g_timeout_add_seconds_full = Interop
.downcallHandle(
"g_timeout_add_seconds_full", FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT,
ValueLayout.JAVA_INT, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.ADDRESS),
false);
public static int timeoutAddSeconds(int priority, int interval, SourceFunc function) {
try (var _arena = Arena.ofConfined()) {
final Arena _functionScope = Arena.ofShared();
int _result;
try {
int hashCode = _functionScope.hashCode();
FUNCS.put(hashCode, function);
_result = (int) g_timeout_add_seconds_full.invokeExact(priority, interval,
(MemorySegment) (function == null ? MemorySegment.NULL : NEXT_SOURCE_FUNC),
Arenas.cacheArena(_functionScope), Arenas.CLOSE_CB_SYM);
} catch (Throwable _err) {
throw new AssertionError(_err);
}
int _returnValue = _result;
return _returnValue;
}
}
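For reference, the same idea reduced to a self-contained sketch that does not depend on Java-GI: one upcall stub is allocated once in the global arena, and each callback is looked up through an integer id stored in the user-data segment. All names here (`SharedUpcallStub`, `register`, `dispatch`) are hypothetical, and the sketch uses a counter for the key instead of `hashCode()` to rule out collisions; that is a design choice of this example, not something Java-GI does. It assumes Java 22 or later.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of the "one shared stub plus registry" pattern.
public class SharedUpcallStub {
    private static final Map<Integer, BooleanSupplier> CALLBACKS = new ConcurrentHashMap<>();
    private static final AtomicInteger NEXT_ID = new AtomicInteger();
    private static final FunctionDescriptor DESC =
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.ADDRESS);

    /** The only upcall stub ever allocated; it lives as long as the JVM. */
    public static final MemorySegment STUB;

    static {
        try {
            MethodHandle target = MethodHandles.lookup()
                    .findStatic(SharedUpcallStub.class, "dispatch", DESC.toMethodType());
            STUB = Linker.nativeLinker().upcallStub(target, DESC, Arena.global());
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Stores the callback and returns a user-data segment holding its id. */
    public static MemorySegment register(BooleanSupplier callback, Arena arena) {
        int id = NEXT_ID.incrementAndGet();
        CALLBACKS.put(id, callback);
        return arena.allocateFrom(ValueLayout.JAVA_INT, id);
    }

    /** Entry point for native code: resolves the id and runs the callback. */
    public static int dispatch(MemorySegment userData) {
        // Pointers arriving from native code are zero-length, hence the reinterpret.
        int id = userData.reinterpret(ValueLayout.JAVA_INT.byteSize())
                .get(ValueLayout.JAVA_INT, 0);
        BooleanSupplier callback = CALLBACKS.get(id);
        if (callback == null) {
            return 0;
        }
        boolean keep = callback.getAsBoolean();
        if (!keep) {
            CALLBACKS.remove(id); // a source that returns false is not called again
        }
        return keep ? 1 : 0;
    }

    /** Number of callbacks currently registered. */
    public static int registered() {
        return CALLBACKS.size();
    }
}
```

In a real GLib call, `STUB` would be passed as the function pointer and the segment returned by `register` as the user data; the arena backing that segment has to stay open until the destroy-notify callback has run.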
Your proposed solution should work, but it will require large changes in a very complex part of the code generator. I'd really prefer not to go there... And I don't see why the current implementation wouldn't work:
- GLib.idleAdd calls g_idle_add_full. Looking at the GIR file, the SourceFunc parameter has "notified" scope. This means the "notify" callback parameter is called immediately after the SourceFunc has finished.
- In Java-GI, the "notify" callback is Arenas.CLOSE_CB_SYM, and it closes the arena of SourceFunc's upcall stub.
- The JVM will deallocate the upcall stub when the arena is closed.
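The last point, that closing the arena ends the lifetime of the upcall stub, can be checked in isolation with plain `java.lang.foreign` calls. This is only a sketch with a made-up class name (`StubLifetime`), assuming Java 22 or later; it shows that the stub's scope is no longer alive after `close()`, but it does not measure whether the CodeHeap space is actually reclaimed.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical helper, not part of Java-GI.
public class StubLifetime {

    /** Trivial upcall target with the native signature "int (void)". */
    public static int callback() {
        return 0;
    }

    /** Returns { alive before close, alive after close } for a freshly created stub. */
    public static boolean[] aliveBeforeAndAfterClose() {
        FunctionDescriptor desc = FunctionDescriptor.of(ValueLayout.JAVA_INT);
        MethodHandle target;
        try {
            target = MethodHandles.lookup()
                    .findStatic(StubLifetime.class, "callback", desc.toMethodType());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        Arena arena = Arena.ofShared();
        // The stub's lifetime is tied to the arena passed in here.
        MemorySegment stub = Linker.nativeLinker().upcallStub(target, desc, arena);
        boolean before = stub.scope().isAlive();
        arena.close();
        boolean after = stub.scope().isAlive();
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] alive = aliveBeforeAndAfterClose();
        System.out.println("alive before close: " + alive[0] + ", after close: " + alive[1]);
    }
}
```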
I think the logic is sound, so I can't really explain your OOM exception. Can you investigate if the CLOSE_CB_SYM is run for your SourceFuncs?
I can take another look to reconfirm. I believe that CLOSE_CB_SYM is run properly because I had regular memory issues when dealing with lambdas rather than passing in static classes as SourceFunc. Regular heap usage isn't increasing significantly -- it's the CodeHeap specifically, which by default on my machine has maximums of 116MiB 'profiled-nmethods', 7.25MiB 'non-nmethods', and 116MiB 'non-profiled nmethods'. It normally takes much longer to fill those, which is why I shared the options for limiting them. I also ran with -XX:+ClassUnloading -XX:+UseCodeCacheFlushing in hopes that those caches would clear up, but they didn't.
Can you provide a minimal reproducible testcase so I can investigate?
I'm working on a testcase now. I have one that doesn't exhibit this problem, so it looks like this may be the symptom of a different issue and not leaking generally. I'm comparing to my failing case to see where it diverges.
I should be able to isolate it in the next couple of days.
package com.bwackninja;
import java.lang.foreign.MemorySegment;
import org.gnome.gdk.Paintable;
import org.gnome.gdk.Snapshot;
import org.gnome.gio.ApplicationFlags;
import org.gnome.glib.GLib;
import org.gnome.glib.Type;
import org.gnome.gobject.GObject;
import org.gnome.gtk.Application;
import org.gnome.gtk.ApplicationWindow;
import org.gnome.gtk.Fixed;
import org.gnome.gtk.Picture;
import org.gnome.gtk.Window;
import org.gnome.pango.Context;
import io.github.jwharm.javagi.gobject.types.Types;
public class Test {
public static class Canvas extends GObject implements Paintable {
public Canvas(MemorySegment address) {
super(address);
}
public static Type gtype = Types.register(Canvas.class);
@Override
public void snapshot(Snapshot snapshot, double width, double height) {
try {
if (snapshot instanceof org.gnome.gtk.Snapshot gsnapshot) {
gsnapshot.save();
gsnapshot.restore();
}
} catch (Exception e) {
e.printStackTrace();
}
}
public static Canvas create() {
Canvas ret = GObject.newInstance(Canvas.gtype);
return ret;
}
@Override
public int getIntrinsicWidth() {
return 1920;
}
@Override
public int getIntrinsicHeight() {
return 1080;
}
}
public static void runApp(Application application) {
var window = (Window) ApplicationWindow.builder()
.setApplication(application)
.setDefaultWidth(1920)
.setDefaultHeight(1080)
.build();
var draw = Canvas.create();
var img = new Picture();
img.setPaintable(draw);
img.addTickCallback(( _, _) -> {
draw.invalidateContents();
return GLib.SOURCE_CONTINUE;
});
Fixed fixed = new Fixed();
fixed.put(img, 0, 0);
img.setSizeRequest(1920, 1080);
window.setChild(fixed);
window.present();
}
public static void main(String[] args) {
var app = new Application("com.bwackninja.Test",
ApplicationFlags.DEFAULT_FLAGS);
app.onActivate(() -> runApp(app));
app.run(args);
}
}
Running this with -XX:NonProfiledCodeHeapSize=10M -XX:ProfiledCodeHeapSize=10M -XX:NonNMethodCodeHeapSize=8M will complain about a full CodeHeap within 10 minutes. With draw.invalidateContents() commented out, it stays stable. A GLib.idleAdd doesn't permanently add to the CodeHeap, but it will fail once the CodeHeap is full, because the upcall stub can no longer be allocated. I didn't hit this issue when I forgot to add the Fixed to the Window.
I'm able to reproduce the issue, but haven't found what is clogging the CodeHeap yet. I used jcmd to repeatedly dump the size and contents of the CodeHeap into a text file until the OOM occurred, but I didn't see anything out of the ordinary. The amount of free space fluctuates a bit but shows no clear trend; then, after 8-10 minutes, it suddenly drops to zero.