Skript icon indicating copy to clipboard operation
Skript copied to clipboard

Skript Variable locking, major lag spikes every 5 or 10 minutes

Open Plutoren opened this issue 7 years ago • 15 comments

Skript version: 2.2-dev36 Paper Spigot Version: latest as of today August 18th (1.12.2)

It seems Skript locks it’s variable file for about 15 seconds every 5 minutes, also causing the server to hang as it waits to read/write to this file. It does not ALWAYS lag every single 5 minutes, I like to think it doesn’t lag if it happens to not be writing or reading any variables during one of the 15 second lock periods. But during busy hours it lags literally every 5 minutes.

Thanks to this latest version of paper, I can show you a thread dump whenever the server hangs. I have discovered the source of my lag spikes, occurring every 5 minutes, is Skript itself, I have also tried disabling all of my addons and found the spikes continued.

I apologize in advance for using a phone for this, my laptop was recently stolen

I have tried disabling several of my more “intensive” scripts but again the issue continues.

I have also tried disabling several other plugins but again no luck.

I also talked to Aikar who made /timings and he helped me optimize my garbage collection + Java flags but again no effect on this issue

This issue has been ongoing for at least two or more months so I would very much appreciate a fix.

This is a strong dedicated server for the record

Here are the plugins I was using when I removed all of my addons, when the issue continued: https://gyazo.com/16f296ba1b8ec900a3e9ce390c10815d

Here is the thread dump provided by paper due to my sever hanging for 10 seconds, and another dump from a continued 15 second hang.

https://gyazo.com/b57332e654ac57c9315cbd0cf280f968 https://gyazo.com/d02dd4cdac2942801c4af5490c1782f2 https://gyazo.com/9cf7d66ad109aea1170aacc88342968a https://gyazo.com/f77deacdb051b986249efca11bdbbc25

If you need to know anything else or have any ideas on how I can fix this, please let me know.

Plutoren avatar Aug 19 '18 04:08 Plutoren

Allow me to ask the obvious question: how large is your variables file?

jaylawl avatar Aug 19 '18 16:08 jaylawl

After about a year of growth it has reached around 900k variables or 0.171644776gb file size, this has been a steady and not abnormal growth as I use Skript fairly heavily

Plutoren avatar Aug 19 '18 17:08 Plutoren

Current variable system does not handle huge databases well. It loads everything to memory and occasionally saves that everything to disk, if I remember correctly. That is of course not done in main server thread, but attempting to set variables during the save operation will have to wait until saving is done. That is probably blocking the server thread in your case.

Improving variable system is planned, but will not be there anytime soon. Currently, the best solution is to reduce the amount of variables. Remember to delete variables that you're no longer using!

bensku avatar Aug 19 '18 20:08 bensku

@bensku On that note... I have thought about this a few times. Would a system where EACH skript had its own (individual) variable file make sense and help a situation like that? I was thinking of suggesting it a while ago, but didn't know if something like that would do any good for the server.

ShaneBeee avatar Aug 19 '18 20:08 ShaneBeee

That doesn't really make sense in practice since I imagine most people use tons of variables which are referenced across multiple files. I do, at least.

TheBentoBox avatar Aug 19 '18 20:08 TheBentoBox

Ahhh yes... good point Bento, duh! I didnt really think of that!

ShaneBeee avatar Aug 19 '18 20:08 ShaneBeee

Yep I’ve come to this conclusion myself after the first question was asked, I did some digging, it seems whenever you call set variable, skript saves a temporary variable at the bottom of the csv file, it doesn’t even check if that same variable is in the temporary list meaning if you overwrite a variable it just creates duplicates, then every 5 minutes it takes this temporary list and sorts it into the main list, showing as I was trying to sort literally thousands of temporary variables into a list of almost a million other variables, this was a problem lol, I’ve begun trimming my variables and got it down to around 500k, no coincidence at all my lag spikes cut from 15 seconds to about 6-10 seconds, additionally I’ve been optimizong how many variables are frequently set, since my temporary list is growing to thousands of variables in the five minute intervals before it sorts again.

I think a useful solution to my problem would be a config setting to determine how often skript saves and sorts the temporary list into the main list, I would rather sort 200 variables every minute than several thousand every 5-10 minutes.

Or perhaps a variable limit, say it should try to clear the temp list if it reaches 200+ variables?

Additionally a debug message when this sort is happening would be useful, possibly containing how much is being sorted.

Although this problem is not entirely fixed it seems we have narrowed down the source, a patch or rewrite would be incredibly appreciated. Thank you for the help.

Plutoren avatar Aug 19 '18 21:08 Plutoren

@Plutoren maybe this will help you trim further: any data that you do not need to persist over server restart, you can add to entities and even blocks as metadata values. Check the docs out for those

jaylawl avatar Aug 19 '18 21:08 jaylawl

Since variable integrity is not guaranteed in new alpha builds, I decided to make somewhat experimental change that might (or might not) help with this issue. In practise, setting variables should not block in any case. Instead, prolonged CSV file save just slowly makes the variable system less efficient; I'm hoping that this will not be noticeable.

bensku avatar Sep 15 '18 17:09 bensku

Was this fixed in the end?

Snow-Pyon avatar Dec 05 '18 08:12 Snow-Pyon

I am not aware of a fix but would be willing to update if there was one, I am a few versions behind due to compatibility issues but would be willing to lose some features in order to fix this issue that is still quite the issue.

Plutoren avatar Jan 18 '19 01:01 Plutoren

Skript locks the file while it reads and/or writes so nothing tampers with it aka you, while it's handling the file. The locking has nothing to do with lag, just an excessive amount of variables.

I was talking to the poster on Discord about this issue and was linked.

Variable setting should not be dependent on saving. If the file is already saving let it finish. The setting variable should be set in the new map and then saved in the next backup interval.

TheLimeGlass avatar Jan 26 '19 08:01 TheLimeGlass

Not talking about backups, talking about sorting of new variables on a 5 minute interval. I made a picture to help people visualize how the csv file is organized. In my specific case, the csv is very large, and the red section that needs to be sorted gets pretty big after 5 minute during peak hours. Here is the actual code I am referring to.

saveTask = new Task(Skript.getInstance(), 5 * 60 * 20, 5 * 60 * 20, true) {
            @Override
            public void run() {
                if (changes.get() >= REQUIRED_CHANGES_FOR_RESAVE) {
                    try {
                        Variables.getReadLock().lock();
                        saveVariables(false);
                        changes.set(0);
                    } finally {
                        Variables.getReadLock().unlock();
                    }
                }
            }
        };

The red section needs to be sorted into the main section and the above code triggers this sort on a 5 minute interval. As you can see it locks the variables while it sorts it. This does not directly cause lag, however attempting to set any variable during the lock period will cause the main thread to wait for the variables to unlock. With a very large variables file the sort itself can take a couple seconds even on a very high end dedicated server. Since I am constantly setting variables for various reasons, the lag is pretty much guaranteed to happen during every sorting event because I attempt to set a variable while the sort is occurring, especially during peak times when lots of variables are being set.

Plutoren avatar Jan 26 '19 08:01 Plutoren

@Plutoren is correct. Recent versions of Skript have some mitigations against this. However, in retrospect, I think they could've been implemented better. Still - if you're willing to test live (take backups, as always when installing a major Skript update) - I would be interested to hear if it helps.

bensku avatar Jan 26 '19 10:01 bensku

I finally updated from dev-36 to 2.3.2 about a week ago, and after rewriting thousands of lines of code that were broken by the update (like || lores, and tons of alias changes) I had a couple chances to observe the performance improvements of the update during peak times.

Overall performance has improved despite not lowering my variable count. Upon further investigating how Skript now handles variable saving it seems that each variable is saved and sorted directly into the csv file, removing the need for a sort/store every 5 minutes.

However despite the fact that it appears there is no heavy task on the 5 minute intervals, I still see that my server has minor lag spikes on the same 5 minute intervals, they are hardly noticeable in-game anymore which is great, but I can still see the spikes in the timings report that seem to occur in the same pattern as before. I am curious as to what other tasks Skript is performing on a 5 minute interval that might not be apparent to me just by looking at the variables file.

On a side note I also noticed that deleting variables are not registered instantly. I have a chat games Skript that asks players simple questions like what's 5+5, and if their message matches the answer variable it deletes the answer variable so that no one else can answer it. In theory it should not allow more than 1 person to answer because the variable is deleted as soon as someone answers it correctly, and I had been running this code for months before the update without issue. However after the update I constantly saw 2-3 players getting a prize from the chat games meaning that the answer variable was not being deleted fast enough. I am sure this is related to the variable storage issue so I thought I would add it here instead of opening a brand new issue. I patched my chat games by adding a time variable when something is answered but clearly there is a minor issue here related to the updates.

Plutoren avatar Feb 05 '19 20:02 Plutoren