
Simple broadcast plugins using sendMessage() drop TPS with 1 or 2 players

Open EasyMochi opened this issue 1 year ago • 5 comments

Expected behavior

Calling sendMessage() in a loop over all online players should not cause the TPS to drop. (I am 99% sure the issue comes from Paper, as this was tested with multiple plugins.)

Observed/Actual behavior

Any plugin that uses sendMessage() on all players drops the server's TPS drastically, creating a 2-second "freeze" on the main thread and a warning that the server is overloaded and running ~50 ticks behind. spark does not show said plugin in the report at all, but removing the plugin(s) using this method fixes the TPS drop. (Even with no players online, if the plugin is configured to still loop through players and try to send messages, the server freezes for 2 seconds.)

The weird thing is that this only seems to happen when the messages are sent every 5 minutes or more. Could this be a GC issue? Am I doing something wrong? A lot of plugins send automated messages to all players like this, and in my tests they all cause the same TPS drop.
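To test the GC hypothesis in isolation, the JVM's cumulative GC pause time can be read through the standard java.lang.management beans and compared against the freeze timestamps. This is a generic sketch, not code from the plugin; the class name is mine:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    public static void main(String[] args) {
        long total = 0;
        // Each bean covers one collector (e.g. G1 Young/Old generation).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            // getCollectionTime() may return -1 if unsupported; clamp to 0.
            total += Math.max(0, gc.getCollectionTime());
        }
        System.out.println("Cumulative GC time: " + total + " ms");
    }
}
```

If the cumulative pause time does not jump by ~2000 ms around each freeze, GC can be ruled out.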

Steps/models to reproduce

  1. Use an auto-broadcast plugin, or any plugin using a method like the one below. (Using multi-line components?)
  2. Wait until a message is sent; the TPS will drop.
  3. The broadcasts need to be spaced by approximately 5 minutes; if they repeat every 10 seconds or so, the drop does not seem to happen.

Plugin and Datapack List

This is the method causing the TPS drop whenever any number of players is online (even 1 or 2). The same thing happens with other plugins using similar methods, and I could not find anything about it. I am very unsure where to ask about this, but I assume it must be an issue with Paper, since the 1.21 builds are experimental at the moment.

private void startBroadcastTask() {
    if (broadcastMessages.isEmpty()) return;
    messageIterator = broadcastMessages.listIterator();

    broadcastTask = Bukkit.getScheduler().runTaskTimer(this, () -> {
        Collection<? extends Player> onlinePlayers = Bukkit.getOnlinePlayers();

        if (!onlinePlayers.isEmpty()) {
            // Restart the rotation once every message has been sent
            if (!messageIterator.hasNext()) messageIterator = broadcastMessages.listIterator();
            List<Component> messageComponents = messageIterator.next();

            onlinePlayers.forEach(player -> {
                messageComponents.forEach(player::sendMessage);
                if (playSound) player.playSound(player.getLocation(), broadcastSound, 1.0f, 1.0f);
            });
        }
    }, 0L, broadcastDelay);
}
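To rule out the message-rotation logic itself, the ListIterator wrap-around pattern from the snippet can be exercised in plain Java with no Bukkit involved (a sketch; the message strings are placeholders):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Stand-in for the broadcast rotation: cycle through messages,
// resetting the iterator once it is exhausted, same as the plugin.
public class RotationDemo {
    public static void main(String[] args) {
        List<String> messages = List.of("tip one", "tip two", "tip three");
        ListIterator<String> it = messages.listIterator();
        List<String> sent = new ArrayList<>();

        for (int tick = 0; tick < 5; tick++) {
            if (!it.hasNext()) it = messages.listIterator(); // wrap around
            sent.add(it.next());
        }
        System.out.println(sent); // [tip one, tip two, tip three, tip one, tip two]
    }
}
```

The rotation is O(1) per broadcast and allocates almost nothing, so it is an unlikely source of a multi-second pause on its own.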

Paper version

version
[16:04:44 INFO]: Checking version, please wait...
[16:04:44 INFO]: This server is running Paper version 1.21-71-master@aa929d6 (2024-07-15T15:14:07Z) (Implementing API version 1.21-R0.1-SNAPSHOT)
You are running the latest version

Other

I am deeply sorry if this isn't the place to report this, but I am running out of ideas. I tried many different broadcast plugins and made my own simple one like the method above, and they all do the same thing. This has been going on for a couple of versions.

About the snippet above: the only other methods are onEnable(), onDisable(), and a method that loads the config file, which only runs once when the plugin loads.

It consistently freezes the main thread for around 2000 ms to 2500 ms, not less, not more. It happened on the current server I am using, as well as on a previous one with a completely different set of plugins.
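A freeze that long can also be confirmed independently of spark with a simple heartbeat watchdog: a daemon thread measures how long the main loop has gone without updating a timestamp. This is a generic sketch that simulates the stall, not Paper's actual watchdog:

```java
import java.util.concurrent.atomic.AtomicLong;

public class StallDetector {
    public static void main(String[] args) throws InterruptedException {
        AtomicLong lastBeat = new AtomicLong(System.nanoTime());
        AtomicLong maxStallMs = new AtomicLong(0);

        // Daemon thread: samples how long the "main loop" has gone without a heartbeat.
        Thread watcher = new Thread(() -> {
            while (true) {
                long stalled = (System.nanoTime() - lastBeat.get()) / 1_000_000;
                maxStallMs.accumulateAndGet(stalled, Math::max);
                try { Thread.sleep(50); } catch (InterruptedException e) { return; }
            }
        });
        watcher.setDaemon(true);
        watcher.start();

        // Simulated tick loop: one iteration deliberately blocks for ~2 s,
        // mimicking the freeze described in the report.
        for (int tick = 0; tick < 10; tick++) {
            lastBeat.set(System.nanoTime());
            Thread.sleep(tick == 5 ? 2000 : 20);
        }
        System.out.println("Longest observed stall: " + maxStallMs.get() + " ms");
    }
}
```

Running something like this inside a test plugin (heartbeating from a repeating task) would show whether the stall lines up exactly with the broadcast timer.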

As mentioned above, removing any plugin that sends messages with this kind of method (sendMessage() looped over online players) fixes the issue.

There are no logs aside from the "Server Overloaded" message from the TPS drop, and spark does not attribute anything to any plugin; only the TPS drop shows on the graph.

IMPORTANT NOTE: It could be that using multi-line components is the issue.

EasyMochi avatar Jul 15 '24 16:07 EasyMochi

Serialisation of message components occurs off the main thread, so it's very unlikely that that is what's causing this. https://spark.lucko.me/docs/guides/Finding-lag-spikes explains how to use spark to isolate lag down to individual ticks.

electronicboy avatar Jul 15 '24 16:07 electronicboy

I am currently running the profiler with the recommended settings and will get back to you in a moment; it takes time, as the spike only happens every 5 minutes. So far the tick monitor is consistent with my report (ignore the 121 ms GC entry):

[16:42:20 INFO]: [⚡] Tick #7522 lasted 1636.52 ms. (3173.07% increase from avg)
[16:42:32 INFO]: [⚡] Tick #7783 included GC lasting 54 ms. (type = Young Gen)
[16:42:42 INFO]: [⚡] Tick #7980 included GC lasting 29 ms. (type = Young Gen)
[16:42:42 INFO]: [⚡] Tick #7988 included GC lasting 38 ms. (type = end of concurrent GC pause)
[16:42:42 INFO]: [⚡] Tick #7993 included GC lasting 0 ms. (type = end of concurrent GC pause)
[16:45:22 INFO]: [⚡] Tick #11192 included GC lasting 12 ms. (type = Young Gen)
[16:46:05 INFO]: [⚡] Tick #12053 included GC lasting 42 ms. (type = Young Gen)
[16:47:04 INFO]: [⚡] Tick #13234 included GC lasting 16 ms. (type = Young Gen)
[16:47:05 INFO]: [⚡] Tick #13242 included GC lasting 15 ms. (type = end of concurrent GC pause)
[16:47:05 INFO]: [⚡] Tick #13246 included GC lasting 1 ms. (type = end of concurrent GC pause)
[16:47:20 INFO]: [⚡] Tick #13522 lasted 1478.82 ms. (2857.68% increase from avg)
[16:47:32 INFO]: [⚡] Tick #13785 included GC lasting 40 ms. (type = Young Gen)
[16:48:35 INFO]: [⚡] Tick #15048 included GC lasting 23 ms. (type = Young Gen)
[16:51:04 INFO]: [⚡] Tick #18019 included GC lasting 19 ms. (type = Young Gen)
[16:51:04 INFO]: [⚡] Tick #18026 lasted 121.71 ms. (143.42% increase from avg)
[16:51:04 INFO]: [⚡] Tick #18027 included GC lasting 84 ms. (type = end of concurrent GC pause)
[16:51:04 INFO]: [⚡] Tick #18032 included GC lasting 0 ms. (type = end of concurrent GC pause)
[16:51:32 INFO]: [⚡] Tick #18584 included GC lasting 16 ms. (type = Young Gen)
[16:52:20 INFO]: [⚡] Tick #19522 lasted 1570.65 ms. (3041.34% increase from avg)
[16:52:20 INFO]: [⚡] Tick #19522 included GC lasting 25 ms. (type = Young Gen)
[16:52:58 INFO]: [⚡] Tick #20316 included GC lasting 7 ms. (type = Young Gen)
[16:52:59 INFO]: [⚡] Tick #20324 included GC lasting 15 ms. (type = end of concurrent GC pause)
[16:52:59 INFO]: [⚡] Tick #20328 included GC lasting 0 ms. (type = end of concurrent GC pause)
[16:55:41 INFO]: [⚡] Tick #23557 included GC lasting 11 ms. (type = Young Gen)
[16:56:22 INFO]: [⚡] Tick #24388 included GC lasting 52 ms. (type = Young Gen)
[16:57:04 INFO]: [⚡] Tick #25230 included GC lasting 16 ms. (type = Young Gen)
[16:57:05 INFO]: [⚡] Tick #25239 included GC lasting 38 ms. (type = end of concurrent GC pause)
[16:57:05 INFO]: [⚡] Tick #25244 included GC lasting 0 ms. (type = end of concurrent GC pause)
[16:57:20 INFO]: [⚡] Tick #25522 lasted 1464.78 ms. (2829.6% increase from avg)
[16:57:31 INFO]: [⚡] Tick #25776 included GC lasting 41 ms. (type = Young Gen)

EasyMochi avatar Jul 15 '24 17:07 EasyMochi

Alright, so I've been running the profiler for 40 minutes or so, and the actual profiler report (/spark profiler open) is empty.

I used /spark profiler --only-ticks-over 1000 according to the documentation, since my spikes are always over 1000ms per tick.

The profiler itself shows "No Data", and only the tick monitor shows what I sent above in the console and the in-game chat.

It happens consistently, emphasis on consistently, when the method I sent in the original post is executed. And again, any plugin using the same kind of method will cause the exact same issue.

Is there anything I can do to help rule out my setup? I am using Oracle's JDK 21 LTS, if that helps, and the machine is a Debian server with a Xeon E5-1620 v2 and 32 GB of RAM. It isn't a VPS; the whole rack is ours. I know the CPU is quite old, but I really don't think that could be the issue. I also assigned 16 GB of heap to the JVM, and I am using the recommended flags from the docs.

EasyMochi avatar Jul 15 '24 17:07 EasyMochi

Does this occur with just that simple code and no other plugins? I suspect you have another plugin that hooks into message sending and does extra work, like something with placeholders.

MiniDigger avatar Jul 16 '24 06:07 MiniDigger

As I mentioned above, the same behaviour was observed on a completely different server with different plugins; the only thing in common is Paper 1.21. Aside from Essentials, most of the plugins are custom-made and pretty simple, nothing related to chat at all.

EasyMochi avatar Jul 16 '24 07:07 EasyMochi

Is this still an issue for you on the latest build? I have been running a local server with a broadcast plugin like this and have yet to run into your lag spike. Additionally, other users do not seem to have experienced this issue, which makes it rather difficult for us to investigate.

lynxplay avatar Jul 30 '24 20:07 lynxplay

I have given this one last attempt and have not been able to replicate it. Feel free to post here again if you have any more hints as to what it could be or how we could replicate it. I'll close this as cannot-reproduce until then, sorry.

lynxplay avatar Aug 04 '24 20:08 lynxplay