Valkyrien-Skies-2
dismount_dead_entities/MixinLivingEntity causing Server Thread lockup, degraded chunk load/unload
This issue occurs when only Valkyrien Skies and addons are installed and no other mods
- [X] I have tested this issue and it occurs when no other mods are installed
Minecraft Version
1.20.x
Mod Loader
Forge
Issue description
During chunk load/unload, mobs with passengers that are being unloaded can freeze the server thread by force-loading the chunks that are unloading in order to perform a dismount check. This can cause multi-second freezes and, in the worst case, a server crash.
This also severely degrades the performance of server chunk loading/unloading, significantly delaying chunk generation and overall straining the server's MSPT.
This can also cause the server to hang at shutdown indefinitely as the getChunk calls never succeed.
Issue reproduction
Load and unload chunks in a world containing mobs with passengers, and watch profiling tools to see the server hang. I plan to release my HealthMonitor mod soon to make diagnosing these issues easier; all it does is some basic automated profiling of the Server Thread.
Logs
https://gist.github.com/codeHusky/844a4a6044f96c7bc8e475ad8273b328
Shutdown Deadlock https://gist.github.com/codeHusky/196fe35e5903779a8882f4be19df3594
Same with Valkyrien Skies 2 (2.1.2-beta.1) on 1.19.2, on both Forge and Fabric (with Clockwork 0.1.13, but the result is the same without the addon).
For anyone looking into this issue: there are also massive problems with dimension mods. When VS2 is installed, several mod issues can arise. The major one is with any Nether-related mod. I can reproduce the same issue with VS2 + BetterNether/Cinderscapes/any other Nether mod, and even with nearly any dimension mod I could test from Modrinth/CF.
When I teleport to the Nether for the first time, either by command or through a Nether portal, TPS seems OK: no degraded server performance, everything works as expected. When leaving any dimension except the Overworld, TPS drops to 0.6, and the server gets overloaded and deadlocks itself in the dismount_dead_entities/MixinLivingEntity$preDismountVehicle() function.
Spark shows this stacktrace. (0.6 TPS, 1375+ MSPT, VS2 callback stuck)
It seems that when the mixin tries to load/unload the chunk, the chunk never unloads, which makes the server stutter a lot. I'm not sure, but the problem is probably here: https://github.com/ValkyrienSkies/Valkyrien-Skies-2/blob/97bf648df1c3200dc5dcd4e94086d516113b736c/common/src/main/java/org/valkyrienskies/mod/mixin/feature/dismount_dead_entities/MixinLivingEntity.java#L39
Performance degrades even more when the Clockwork addon is installed.
I have the same issue in my modpack with 425 mods; binary search turned up nothing until VS2 and its addons were removed from the pack entirely. With VS2 removed there were no issues. Also note that singleplayer works as expected; the problem only occurs on dedicated servers.
Hi, I have a very similar issue on 1.19.2 with VS and Clockwork. Are you sure there is no way of resolving this on a server? These mods are the core of the gameplay, and these crashes are a real problem.
Hello! I noticed the exact same problem, and it is also a core mod for my modpack: https://spark.lucko.me/pdhyk3nfxu
Removing the mixin solved my issue; the server has now been working as intended for a week. So far I haven't found a proper reimplementation of the mixin that doesn't get stuck, but dead entities seem to be handled just fine even without this mixin.
Same issue here, very similar spark profile: https://spark.lucko.me/ur6cHkPR7N
Currently compiling a version with that mixin temporarily removed, will report back if that fixes the lag issue
The version crashed, presumably because the main branch has changes that the latest released Eureka version hasn't been updated for. As such, I am unable to test.
As a hack, you can "disable" the mixin by opening the Valkyrien Skies jarfile with any archive tool (e.g. 7-zip), opening valkyrienskies-common.mixins.json, removing the "feature.dismount_dead_entities.MixinLivingEntity", line, and finally replacing the file in the jarfile.
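For anyone who would rather script this than edit the jar by hand, here is a rough sketch of the same hack in Java. It assumes the config sits at the root of the jar and lists one mixin per line; the class and method names (MixinDisabler, removeMixinEntry, patchJar) are made up for illustration and are not part of Valkyrien Skies. Back up the jar first.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Valkyrien Skies.
final class MixinDisabler {

    /** Drops the line listing the given mixin and keeps the surrounding JSON array valid. */
    static String removeMixinEntry(String json, String mixinName) {
        String needle = "\"" + mixinName + "\"";
        List<String> kept = new ArrayList<>();
        for (String line : json.split("\n", -1)) {
            String trimmed = line.trim();
            boolean isEntry = trimmed.equals(needle) || trimmed.equals(needle + ",");
            if (!isEntry) {
                kept.add(line);
                continue;
            }
            // If the removed entry was the last one in its array (no trailing comma),
            // the entry before it must lose its comma.
            if (!trimmed.endsWith(",") && !kept.isEmpty()) {
                int last = kept.size() - 1;
                String prev = kept.get(last).stripTrailing();
                if (prev.endsWith(",")) {
                    kept.set(last, prev.substring(0, prev.length() - 1));
                }
            }
        }
        return String.join("\n", kept);
    }

    /** Rewrites the mixin config inside the jar in place (assumes the config is at the jar root). */
    static void patchJar(Path jar, String configName, String mixinName) throws IOException {
        URI uri = URI.create("jar:" + jar.toUri());
        try (FileSystem zip = FileSystems.newFileSystem(uri, Map.of())) {
            Path config = zip.getPath(configName);
            String json = Files.readString(config, StandardCharsets.UTF_8);
            Files.writeString(config, removeMixinEntry(json, mixinName), StandardCharsets.UTF_8);
        }
    }
}
```

Usage would look something like MixinDisabler.patchJar(Path.of("mods/your-vs2.jar"), "valkyrienskies-common.mixins.json", "feature.dismount_dead_entities.MixinLivingEntity"), where the jar path is a placeholder for whatever your file is actually called.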
That's what I'm doing right now. Hopefully it works
As far as I can tell, my issue is fixed with that mixin disabled (of course, disabling the mixin probably causes other issues). While flying around the Nether a lot with the mixin enabled, the server chugs down to a literal 1 TPS. With the mixin disabled, it holds 20 TPS.
Likewise, can confirm. Looking at the stacktrace from my custom watchdog, the issue lies here: https://github.com/ValkyrienSkies/Valkyrien-Skies-2/blob/97bf648df1c3200dc5dcd4e94086d516113b736c/common/src/main/java/org/valkyrienskies/mod/mixin/feature/dismount_dead_entities/MixinLivingEntity.java#L35
The entities' setRemoved methods (which then call dismount) are called AFTER the chunk has been unloaded, so getBlockState has to load the chunk again to find the blockstate.
This happens for every single entity that is ever loaded, locking up the server slowly over time.
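To illustrate the failure mode and one possible direction for a fix, here is a toy model in plain Java. Nothing in it is Minecraft or VS2 code: FakeLevel, dismountUnguarded and dismountGuarded are hypothetical stand-ins, and the guard is only a guess at what a fix could look like (checking that the chunk is still loaded before asking for the block state). Whether the real mixin can safely skip the lookup in that case is something the maintainers would have to confirm.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only; none of these types exist in Minecraft or Valkyrien Skies.
final class DismountGuardSketch {

    /** Hypothetical stand-in for a server level that tracks loaded chunks. */
    static final class FakeLevel {
        private final Set<Long> loadedChunks = new HashSet<>();
        int forcedLoads = 0;

        private static long key(int chunkX, int chunkZ) {
            return ((long) chunkX << 32) | (chunkZ & 0xFFFFFFFFL);
        }

        void load(int chunkX, int chunkZ) { loadedChunks.add(key(chunkX, chunkZ)); }

        void unload(int chunkX, int chunkZ) { loadedChunks.remove(key(chunkX, chunkZ)); }

        boolean isLoaded(int blockX, int blockZ) {
            return loadedChunks.contains(key(blockX >> 4, blockZ >> 4));
        }

        /** Models a blocking block lookup: an unloaded chunk is loaded again first. */
        String getBlockState(int blockX, int blockZ) {
            if (!isLoaded(blockX, blockZ)) {
                forcedLoads++;
                load(blockX >> 4, blockZ >> 4);
            }
            return "air";
        }
    }

    /** Unguarded lookup: always queries the block, reloading the chunk if needed. */
    static void dismountUnguarded(FakeLevel level, int blockX, int blockZ) {
        level.getBlockState(blockX, blockZ);
    }

    /** Guarded lookup: skips the block query when the chunk is already gone. */
    static boolean dismountGuarded(FakeLevel level, int blockX, int blockZ) {
        if (!level.isLoaded(blockX, blockZ)) {
            return false; // nothing to inspect, and no chunk reload is triggered
        }
        level.getBlockState(blockX, blockZ);
        return true;
    }
}
```

In the model, calling dismountUnguarded for an entity whose chunk was just unloaded bumps forcedLoads and puts the chunk back into the loaded set, which matches the "never unloads" symptom reported above; dismountGuarded leaves the chunk unloaded.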