java-gi

Generating upcalls fills up the Java CodeHeap and can lead to an OutOfMemoryError

Open BwackNinja opened this issue 7 months ago • 4 comments

This one is a bit more invasive, and I'm a bit out of my depth for fixing it myself.

The problem is easier to reproduce with smaller code heap sizes, for example with the options -XX:NonProfiledCodeHeapSize=10M -XX:ProfiledCodeHeapSize=10M -XX:NonNMethodCodeHeapSize=8M

The individual values can be watched as they change in the Memory tab of a JMX client such as JDK Mission Control (jmc). When all three code heaps fill up (CodeHeap 'profiled nmethods', CodeHeap 'non-nmethods', and CodeHeap 'non-profiled nmethods'), an OutOfMemoryError is thrown.
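As an alternative to a JMX client, the same pools can be read from inside the running application through the standard java.lang.management API. The following is a minimal sketch, assuming a HotSpot JVM that exposes the code cache as non-heap memory pools whose names contain "Code" (for example the three CodeHeap segments named above, or a single "CodeCache" pool when the code cache is not segmented). The class name CodeHeapReport is hypothetical and not part of java-gi.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of java-gi.
final class CodeHeapReport {

    /** Returns the used bytes of every code-related non-heap memory pool. */
    static Map<String, Long> usedBytesByPool() {
        Map<String, Long> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: code pools are NON_HEAP and have "code" in their name.
            boolean codePool = pool.getType() == MemoryType.NON_HEAP
                    && pool.getName().toLowerCase(Locale.ROOT).contains("code");
            if (!codePool) {
                continue;
            }
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                result.put(pool.getName(), usage.getUsed());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print one line per pool; call this periodically to watch for growth.
        usedBytesByPool().forEach((name, used) ->
                System.out.printf("%s: %d KiB used%n", name, used / 1024));
    }
}
```

Calling usedBytesByPool() from a timer and logging the result gives roughly the same picture as the Memory tab, without attaching an external tool.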

[5323.954s][warning][codecache] CodeHeap 'non-nmethods' is full. Compiler has been disabled.
[5323.954s][warning][codecache] Try increasing the code heap size using -XX:NonNMethodCodeHeapSize=
OpenJDK 64-Bit Server VM warning: CodeHeap 'non-nmethods' is full. Compiler has been disabled.
OpenJDK 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonNMethodCodeHeapSize=
CodeHeap 'non-profiled nmethods': size=119172Kb used=119171Kb max_used=119171Kb free=0Kb
 bounds [0x00007cfcf7f9f000, 0x00007cfcff400000, 0x00007cfcff400000]
CodeHeap 'profiled nmethods': size=119164Kb used=119163Kb max_used=119163Kb free=0Kb
 bounds [0x00007cfcf0400000, 0x00007cfcf785f000, 0x00007cfcf785f000]
CodeHeap 'non-nmethods': size=7424Kb used=7423Kb max_used=7423Kb free=0Kb
 bounds [0x00007cfcf785f000, 0x00007cfcf7f9f000, 0x00007cfcf7f9f000]
CodeCache: size=245760Kb, used=245757Kb, max_used=245757Kb, free=0Kb
 total_blobs=322192, nmethods=2008, adapters=1272, full_count=1
Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0
[5324.272s][warning][codecache] CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
[5324.272s][warning][codecache] Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
OpenJDK 64-Bit Server VM warning: CodeHeap 'non-profiled nmethods' is full. Compiler has been disabled.
OpenJDK 64-Bit Server VM warning: Try increasing the code heap size using -XX:NonProfiledCodeHeapSize=
CodeHeap 'non-profiled nmethods': size=119172Kb used=119171Kb max_used=119171Kb free=0Kb
 bounds [0x00007cfcf7f9f000, 0x00007cfcff400000, 0x00007cfcff400000]
CodeHeap 'profiled nmethods': size=119164Kb used=119163Kb max_used=119163Kb free=0Kb
 bounds [0x00007cfcf0400000, 0x00007cfcf785f000, 0x00007cfcf785f000]
CodeHeap 'non-nmethods': size=7424Kb used=7423Kb max_used=7423Kb free=0Kb
 bounds [0x00007cfcf785f000, 0x00007cfcf7f9f000, 0x00007cfcf7f9f000]
CodeCache: size=245760Kb, used=245757Kb, max_used=245757Kb, free=0Kb
 total_blobs=322193, nmethods=2009, adapters=1272, full_count=570
Compilation: disabled (not enough contiguous free space left), stopped_count=1, restarted_count=0
java.lang.AssertionError: java.lang.InternalError: java.lang.NoSuchMethodException: no such method: java.lang.invoke.MethodHandle.linkToSpecial(Object,long,long,int,long,long,MemberName)void/invokeStatic

Classes extending io.github.jwharm.javagi.base.FunctionPointer like org.gnome.glib.SourceFunc require memory in the CodeHeap whenever they create a new upcall stub. Instead of creating a new upcall stub for each call to functions like GLib.idleAdd, a generic static upcall stub can be created once, and the MemorySegment passed to the upcall can be used to look up the actual SourceFunc, similar to how Arenas.close_cb works.

I've tested manually replacing calls to GLib.idleAdd with the following, and it avoids the OutOfMemoryError:

	private static final Map<Integer, SourceFunc> FUNCS = new HashMap<>();

	/**
	 * The upcall stub for the timeoutAddSeconds callback method
	 */
	public static final MemorySegment NEXT_SOURCE_FUNC;

	// Allocate the upcall stub for the SourceFunc callback method
	static {
		try {
			FunctionDescriptor _fdesc = FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.ADDRESS);
			MethodHandle _handle = MethodHandles.lookup().findStatic(SlideShowPlaylistSprite.class, "upcall",
					_fdesc.toMethodType());
			NEXT_SOURCE_FUNC = Linker.nativeLinker().upcallStub(_handle, _fdesc, Arena.global());
		} catch (NoSuchMethodException | IllegalAccessException e) {
			throw new RuntimeException(e);
		}
	}

	/**
	 * The {@code upcall} method is called from native code. The parameters
	 * are marshaled and {@link #run} is executed.
	 */
	public static int upcall(MemorySegment userData) {
		int hashCode = userData.reinterpret(ValueLayout.JAVA_INT.byteSize()).get(ValueLayout.JAVA_INT, 0);

		SourceFunc func = FUNCS.get(hashCode);
		if (func != null) {
			var _result = func.run();
			if (!_result) {
				FUNCS.remove(hashCode);
			}
			return _result ? 1 : 0;
		}
		return 0;
	}

	@FunctionalInterface
	@Generated("io.github.jwharm.JavaGI")
	public static interface SourceFunc {
		/**
		 * Specifies the type of function passed to {@link GLib#timeoutAdd},
		 * {@code GLib#timeoutAddFull}, {@link GLib#idleAdd}, and
		 * {@code GLib#idleAddFull}.
		 * <p>
		 * When calling {@link Source#setCallback}, you may need to cast a
		 * function of a different type to this type. Use {@code GLib#SOURCEFUNC} to
		 * avoid warnings about incompatible function types.
		 */
		boolean run();
	}

	static final MethodHandle g_timeout_add_seconds_full = Interop
			.downcallHandle(
					"g_timeout_add_seconds_full", FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT,
							ValueLayout.JAVA_INT, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.ADDRESS),
					false);

	public static int timeoutAddSeconds(int priority, int interval, SourceFunc function) {
		try (var _arena = Arena.ofConfined()) {
			final Arena _functionScope = Arena.ofShared();
			int _result;
			try {
				int hashCode = _functionScope.hashCode();
				FUNCS.put(hashCode, function);
				_result = (int) g_timeout_add_seconds_full.invokeExact(priority, interval,
						(MemorySegment) (function == null ? MemorySegment.NULL : NEXT_SOURCE_FUNC),
						Arenas.cacheArena(_functionScope), Arenas.CLOSE_CB_SYM);
			} catch (Throwable _err) {
				throw new AssertionError(_err);
			}
			int _returnValue = _result;
			return _returnValue;
		}
	}
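The bookkeeping behind the single shared upcall stub can also be looked at in isolation. The sketch below is plain Java with no foreign-function calls: it models only the registry and the dispatch step, under the assumption that a unique id is handed to native code as the user data value (for instance wrapped with MemorySegment.ofAddress(id) and read back with userData.address() inside the upcall). It is a variation on the snippet above, not the java-gi implementation: it uses an AtomicLong counter instead of Object.hashCode(), which is not guaranteed to be unique, and a ConcurrentHashMap so that registration and dispatch may happen on different threads. The names CallbackRegistry, register, dispatch and unregister are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the registry behind one shared upcall stub.
final class CallbackRegistry {
    private static final Map<Long, BooleanSupplier> CALLBACKS = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    /** Stores the callback and returns the id to pass to native code as user data. */
    static long register(BooleanSupplier callback) {
        long id = NEXT_ID.getAndIncrement();
        CALLBACKS.put(id, callback);
        return id;
    }

    /** Models the body of the shared upcall: look up the callback by id and run it. */
    static int dispatch(long id) {
        BooleanSupplier callback = CALLBACKS.get(id);
        if (callback == null) {
            return 0; // unknown id or already removed: tell the caller to stop
        }
        boolean keep = callback.getAsBoolean();
        if (!keep) {
            CALLBACKS.remove(id); // the callback asked to be removed
        }
        return keep ? 1 : 0;
    }

    /** Models the destroy-notify callback: drop the entry explicitly. */
    static void unregister(long id) {
        CALLBACKS.remove(id);
    }

    static int size() {
        return CALLBACKS.size();
    }

    public static void main(String[] args) {
        int[] runs = {0};
        long id = register(() -> ++runs[0] < 3); // continue twice, then stop
        while (dispatch(id) == 1) {
            // keep dispatching until the callback returns false
        }
        System.out.println("runs=" + runs[0] + ", remaining=" + size());
    }
}
```

Because every registration reuses the same dispatch entry point, the number of upcall stubs stays constant no matter how many callbacks are registered; only the map grows and shrinks.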

BwackNinja avatar May 08 '25 18:05 BwackNinja

Your proposed solution should work, but it will require large changes in a very complex part of the code generator. I'd really prefer not to go there... And I don't see why the current implementation wouldn't work:

  • GLib.idleAdd calls g_idle_add_full. Looking at the GIR file, the SourceFunc parameter has "notified" scope. This means the "notify" callback parameter is called immediately after the SourceFunc has finished.
  • In Java-GI, the "notify" callback is Arenas.CLOSE_CB_SYM, and it closes the arena of SourceFunc's upcall stub.
  • The JVM will deallocate the upcall stub when the arena is closed.

I think the logic is sound, so I can't really explain your OOM exception. Can you investigate if the CLOSE_CB_SYM is run for your SourceFuncs?

jwharm avatar May 08 '25 19:05 jwharm

I can take another look to reconfirm. I believe that CLOSE_CB_SYM is run properly because I had regular memory issues when dealing with lambdas rather than passing in static classes as SourceFunc. Regular heap usage isn't increasing significantly -- it's the CodeHeap specifically, which by default on my machine has maximums of 116MiB for 'profiled nmethods', 7.25MiB for 'non-nmethods', and 116MiB for 'non-profiled nmethods'. It normally takes much longer to fill those, which is why I shared the options for limiting them. I also ran with -XX:+ClassUnloading -XX:+UseCodeCacheFlushing in the hope that those code heaps would be cleared, but they weren't.

BwackNinja avatar May 08 '25 20:05 BwackNinja

Can you provide a minimal reproducible testcase so I can investigate?

jwharm avatar May 10 '25 09:05 jwharm

I'm working on a testcase now. I have one that doesn't exhibit this problem, so it looks like this may be the symptom of a different issue and not leaking generally. I'm comparing to my failing case to see where it diverges.

I should be able to isolate it in the next couple of days.

BwackNinja avatar May 10 '25 17:05 BwackNinja

package com.bwackninja;

import java.lang.foreign.MemorySegment;

import org.gnome.gdk.Paintable;
import org.gnome.gdk.Snapshot;
import org.gnome.gio.ApplicationFlags;
import org.gnome.glib.GLib;
import org.gnome.glib.Type;
import org.gnome.gobject.GObject;
import org.gnome.gtk.Application;
import org.gnome.gtk.ApplicationWindow;
import org.gnome.gtk.Fixed;
import org.gnome.gtk.Picture;
import org.gnome.gtk.Window;
import org.gnome.pango.Context;

import io.github.jwharm.javagi.gobject.types.Types;

public class Test {
	public static class Canvas extends GObject implements Paintable {
		public Canvas(MemorySegment address) {
			super(address);
		}
		
		public static Type gtype = Types.register(Canvas.class);
		
		@Override
		public void snapshot(Snapshot snapshot, double width, double height) {
			try {
				if (snapshot instanceof org.gnome.gtk.Snapshot gsnapshot) {
					gsnapshot.save();
					gsnapshot.restore();
				}
			} catch (Exception e) {
				e.printStackTrace();
			}
		}
		
		public static Canvas create() {
			Canvas ret = GObject.newInstance(Canvas.gtype);
			return ret;
		}
		
		@Override
		public int getIntrinsicWidth() {
			return 1920;
		}
		
		@Override
		public int getIntrinsicHeight() {
			return 1080;
		}
	}

	public static void runApp(Application application) {
		var window = (Window) ApplicationWindow.builder()
				.setApplication(application)
				.setDefaultWidth(1920)
				.setDefaultHeight(1080)
				.build();
		var draw = Canvas.create();
		var img = new Picture();
		img.setPaintable(draw);
		img.addTickCallback(( _, _) -> {
			draw.invalidateContents();
			return GLib.SOURCE_CONTINUE;
		});
		Fixed fixed = new Fixed();
		fixed.put(img, 0, 0);
		img.setSizeRequest(1920, 1080);
		window.setChild(fixed);
		window.present();
	}

	public static void main(String[] args) {
		var app = new Application("com.bwackninja.Test",
				ApplicationFlags.DEFAULT_FLAGS);
		app.onActivate(() -> runApp(app));
		app.run(args);
	}
}

Running this with -XX:NonProfiledCodeHeapSize=10M -XX:ProfiledCodeHeapSize=10M -XX:NonNMethodCodeHeapSize=8M produces warnings about a full CodeHeap within 10 minutes. With draw.invalidateContents() commented out, CodeHeap usage stays stable. A GLib.idleAdd call doesn't permanently add to the CodeHeap, but it fails once the CodeHeap is full because the upcall stub can no longer be allocated. I also didn't hit this issue when I forgot to add the Fixed to the Window.

BwackNinja avatar May 12 '25 21:05 BwackNinja

I'm able to reproduce the issue, but haven't found what is clogging the CodeHeap yet. I used jcmd to repeatedly dump the size and contents of the CodeHeap into a text file until the OOM occurred, but I didn't see anything out of the ordinary. The amount of free space fluctuates a bit without showing a clear trend, until it suddenly drops to zero after 8-10 minutes.

jwharm avatar May 14 '25 18:05 jwharm