HolographicDisplays
HolographicDisplays copied to clipboard
Make BaseHologramManager thread-safe and guard against nulls
This attempts to fix NullPointerException
bubbling on ChunkUnloadEvent
by introducing null-check in BaseHologramManager#addHologram
and making BaseHologramManager#holograms
collection synchronized to further harden the API.
The PR is intended to supersede attempted null-checks in each method from #631. Currently, a draft as I'm testing if the issue is fixed and that seems to be a non-trivial task due to the nature of the problem. At the moment I'm not really sure what is the cause.
[03:25:12 ERROR]: Could not pass event ChunkUnloadEvent to HolographicDisplays v3.0.0-SNAPSHOT-b205
java.lang.NullPointerException: Cannot invoke "me.filoghost.holographicdisplays.plugin.hologram.base.BaseHologram.onChunkUnload(org.bukkit.Chunk)" because "hologram" is null
at me.filoghost.holographicdisplays.plugin.hologram.base.BaseHologramManager.onChunkUnload(BaseHologramManager.java:75) ~[HolographicDisplays.jar:?]
at me.filoghost.holographicdisplays.plugin.listener.ChunkListener.onChunkUnload(ChunkListener.java:62) ~[HolographicDisplays.jar:?]
at com.destroystokyo.paper.event.executor.asm.generated.GeneratedEventExecutor45.execute(Unknown Source) ~[?:?]
at org.bukkit.plugin.EventExecutor.lambda$create$1(EventExecutor.java:75) ~[purpur-api-1.18.2-R0.1-SNAPSHOT.jar:?]
at co.aikar.timings.TimedEventExecutor.execute(TimedEventExecutor.java:76) ~[purpur-api-1.18.2-R0.1-SNAPSHOT.jar:git-Purpur-1632]
at org.bukkit.plugin.RegisteredListener.callEvent(RegisteredListener.java:70) ~[purpur-api-1.18.2-R0.1-SNAPSHOT.jar:?]
at org.bukkit.plugin.SimplePluginManager.callEvent(SimplePluginManager.java:671) ~[purpur-api-1.18.2-R0.1-SNAPSHOT.jar:?]
at net.minecraft.world.level.chunk.LevelChunk.unloadCallback(LevelChunk.java:887) ~[?:?]
at net.minecraft.server.level.ChunkHolder.lambda$updateFutures$8(ChunkHolder.java:575) ~[?:?]
at net.minecraft.server.level.ChunkMap$CallbackExecutor.run(ChunkMap.java:200) ~[?:?]
at net.minecraft.server.level.ChunkHolder.updateFutures(ChunkHolder.java:585) ~[?:?]
at net.minecraft.server.level.DistanceManager.runAllUpdates(DistanceManager.java:246) ~[?:?]
at net.minecraft.server.level.ServerChunkCache.runDistanceManagerUpdates(ServerChunkCache.java:830) ~[?:?]
at net.minecraft.server.level.ServerChunkCache.tick(ServerChunkCache.java:904) ~[?:?]
at net.minecraft.server.level.ServerLevel.tick(ServerLevel.java:650) ~[?:?]
at net.minecraft.server.MinecraftServer.tickChildren(MinecraftServer.java:1621) ~[purpur-1.18.2.jar:git-Purpur-1632]
at net.minecraft.server.dedicated.DedicatedServer.tickChildren(DedicatedServer.java:522) ~[purpur-1.18.2.jar:git-Purpur-1632]
at net.minecraft.server.MinecraftServer.tickServer(MinecraftServer.java:1483) ~[purpur-1.18.2.jar:git-Purpur-1632]
at net.minecraft.server.MinecraftServer.runServer(MinecraftServer.java:1247) ~[purpur-1.18.2.jar:git-Purpur-1632]
at net.minecraft.server.MinecraftServer.lambda$spin$1(MinecraftServer.java:321) ~[purpur-1.18.2.jar:git-Purpur-1632]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
Wild stuff!
> > server.pluginManager.getPlugin("HolographicDisplays").v2HologramManager.holograms
[04:05:23 INFO]: $> [
Hologram{position=Position{worldName=world, x=189.07263782238957, y=68.3, z=-9.758278752205966}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=1207.6946531926249, y=63.3, z=-1848.9788851634933}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=221.31690475473889, y=75.8, z=11.300000011920929}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=220.9262401297916, y=70.2375, z=23.084974923072437}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=224.5, y=70.3, z=22.5}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=243.50777416170928, y=106.3, z=960.1811414732996}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=-678.8538744560602, y=-12.75949474255298, z=1664.0480630244513}, lines=[TextLine{text=⚔ 1.00}], deleted=true},
null,
Hologram{position=Position{worldName=world, x=-4.300000011920929, y=96.3, z=-1685.512499988079}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=1737.487500011921, y=76.3, z=1851.0782310321765}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=216.95655397764685, y=70.2375, z=30.68788966676058}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=222.76982676406578, y=77.78839775014912, z=25.514431758331156}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=-334.4805594241093, y=9.3, z=-1718.7752759523817}, lines=[], deleted=false},
Hologram{position=Position{worldName=world, x=-1000.4945456248589, y=-53.7, z=292.33610733412524}, lines=[ItemLine{itemStack=ItemStack{BARRIER x 1}}, TextLine{text=00:22:20}], deleted=false},
Hologram{position=Position{worldName=world, x=3387.0508828352545, y=76.3, z=557.5561918167947}, lines=[ItemLine{itemStack=ItemStack{BARRIER x 1}}, TextLine{text=00:08:34}], deleted=false}
]
Quick temp fix using okaeri-poly:
// works for v3.0.0-SNAPSHOT (beta 3)
// groovy cannot into direct field access in super
def c=me.filoghost.holographicdisplays.plugin.hologram.base.BaseHologramManager;
def m=server.pluginManager.getPlugin("HolographicDisplays").v2HologramManager;
def f=c.declaredFields[0];
f.setAccessible(true);
f.get(m).remove(null);
Tested this PR. Works like a charm!
My instances were running stable ever since, so at least this isn't breaking something. Converting to normal PR then.
Thanks for this, but I prefer a different approach: making sure that the calling thread is the main thread. The API is not designed to be used asynchronously, other things will be break if used that way.
The null check needs to be there, but I wonder if there is a null because a hologram is created asynchronously and the list "breaks", or something is actually adding a null element.
This issue is really a goose chase. But I believe that making BaseHologramManager thread-safe will be much better for backwards compatibility. I'm also convinced the main thread check here will break multiple plugins when there is no need to, assuming the cause was actually a result of an asynchronous call.
I'm also a bit confused what would have caused this in my codebase, as I don't think I had any asynchronous calls to the plugin's API, nor I use any 3rd-party plugin that integrate with it. However, I might have missed something (especially since it was 2 months ago) and concurrent access still seems as the most likely cause.
After further checking, there are only two places that use the addHologram method and they both add a hologram they directly create, which cannot be null. Both the new and the old API have a check that prevents creating holograms asynchronously, so I cannot figure out how it's possible for that to happen (unless you're bypassing the main API to call those methods asyncrhronously).
I think the cause is one of the "delete" methods that is called asynchronously, since those do not have any check.
I've added more checks in https://github.com/filoghost/HolographicDisplays/commit/777b696f3881d1d27dc6ba249bb904aff9b9e3da, please try again to see if we can catch the issue earlier.
I think I can confirm this one to be the case without testing. I recall occasionally seeing stacktraces related to this on my test server after being AFK awhile and then moving. Turns out I have AFK holograms and this is the only place where holograms API is called asynchronously, it has a couple of nested methods, but one of the callers is async scheduled timer.
It wasn't that direct, so I missed it before, as it basically goes from periodic task into update routine, then update routine has the case where it goes into deletion handler which is called in most of the cases synchronously.
Just updated my code for this case, but I would strongly recommend synchronization as per my PR anyway for the compatibility reasons, and would add a warning (reason + stacktrace) to these calls if you really want to prevent async delete calls in the future.
Async calls has never been supported, a check was simply missing. Compatibility is needed when changing features, not when fixing errors.
The API might support async calls in the future, but many things would need to be updated to work properly.