Stop penalizing servers for running up-to-date code

Open Warr1024 opened this issue 4 years ago • 6 comments

Server list is sorted (at least partially) based on uptime, which penalizes servers that restart to stay up-to-date with current mods/games, and encourages servers to continue to run stale versions in order to stay higher on the listing. Restarting a server should not push it to the bottom of the list.

Instead, it would be better to track availability, e.g. each time a poll is done, record whether the server was on the list and responded to the ping or not. Compute availability stats based on a time window, such as the past hour or day or week, and use that in place of uptime when determining ranking. Uptime can still be captured as a metric but should not be used in the sorting algorithm.
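As a rough illustration of the windowed availability idea, here is a minimal sketch. It assumes the list polls each server periodically and stores one boolean per poll; `AvailabilityTracker` and its method names are hypothetical and not taken from the existing server list code.

```python
from collections import deque


class AvailabilityTracker:
    """Rolling record of poll outcomes for one server (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, responded) pairs, oldest first

    def record(self, timestamp, responded):
        """Store the outcome of one poll and drop samples older than the window."""
        self.samples.append((timestamp, bool(responded)))
        self._expire(timestamp)

    def _expire(self, now):
        cutoff = now - self.window_seconds
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def availability(self, now):
        """Fraction of polls inside the window that the server answered."""
        self._expire(now)
        if not self.samples:
            return 0.0  # assumption: a server with no samples gets no credit
        answered = sum(1 for _, ok in self.samples if ok)
        return answered / len(self.samples)
```

A ranking function could then use `availability(now)` over an hour, day, or week in place of raw uptime, so a quick restart between two polls costs nothing.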

Warr1024, Jul 07 '20 14:07

Maybe add a penalty if uptime < 1h? Restarting every hour is of no benefit to players.

nerzhul, Jul 08 '20 13:07

I'm not convinced though that ANY restart penalty policy is useful. I don't see what kind of advantage blindly restarting your server all the time offers to server admins, so it doesn't seem like a thing that would need to be discouraged. Frequent restarts are also not necessarily more annoying than being forced to play with a critical bug that a fix is available for just because of a minimum uptime policy. We already have a popularity metric, so any server behavior users find annoying will be reflected in that. The value of uptime/availability metrics is just so players can find servers where the operators appear to be committed to actually keeping their servers running and not get sucked into a bunch of fly-by-night servers that will just disappear soon anyway.

If we do end up deciding to apply some kind of score penalty for restarts, we need to take into account long periods of continuous uptime, and not just look at current server uptime. My server has a policy of requiring 2 hours of uptime after the previous restart, unless manually overridden, before the next auto-restart for updates, and it should not have to pay any penalty for the hour after its restart, unless the restart before THAT was less than an hour earlier. This might be feasible, but I don't think it's worth either the complication to implement or the bikeshedding it will probably lead to...

Warr1024, Jul 08 '20 16:07

It's beneficial as a way to keep players from joining servers that are unstable or stuck in boot loops. But time since last start is a poor metric for this; the number of restarts in the last hour would be better. This requires some form of storage, however.
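A minimal sketch of what counting recent starts could look like, assuming the list sees a timestamp each time a server announces a fresh start. `StartCounter` and its parameters are hypothetical names for illustration; the bounded deque is one way to keep the storage small.

```python
from collections import deque


class StartCounter:
    """Counts how often one server started recently (hypothetical helper)."""

    def __init__(self, window_seconds=3600, max_kept=16):
        self.window_seconds = window_seconds
        # Bounded: only the newest max_kept start times are remembered.
        self.starts = deque(maxlen=max_kept)

    def record_start(self, timestamp):
        self.starts.append(timestamp)

    def recent_starts(self, now):
        """Number of remembered starts that fall inside the window ending at now."""
        cutoff = now - self.window_seconds
        return sum(1 for t in self.starts if t > cutoff)
```

A server in a boot loop would accumulate many starts inside the window, while a server that restarts once for an update would show only one.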

rubenwardy, Dec 24 '20 23:12

Keeping a few timestamps in memory suffices so here we go: 5d5f31d295d8ed3b94a70a4470e37481facd6f9d

sfan5, Mar 15 '21 12:03

It turns out that number of restarts in the last hour is not as good as once assumed, because it doesn't take into account whether the restarts are actually harmful or beneficial to player experience.

Since this issue was originally examined, I have run into some usability issues with forcing players to wait 2 hours for the next restart, and so have made some changes. The 2 hour limit is only enforced if players are actually online; if the server becomes empty, the restart is performed immediately. Players can now unanimously agree to leave and rejoin in order to trigger an early restart and get access to the latest code without waiting. It's also more likely this way that the server is already up to date when the first player logs in, so they don't have to contend with a restart countdown. However, while this improves the player experience, it also worsens the server score.

To handle this properly, the server list would probably need to at least:

  • Estimate the impact to players NOT in the server of the restart, e.g. total amount of downtime over the past time window, to estimate the probability that a player attempting to join it would find it unavailable.
  • Estimate the impact to players in the server of the restart, e.g. the number of players who were online some time before the shutdown.

These are probably doable using the existing infrastructure, and just keeping a few more stats in-memory and expanding the heuristic function. Basically keep track of last sampled player count, and don't record a restart if (1) the last player count was zero, and (2) the total downtime is less than a certain limit (e.g. 1 minute, or the sampling period).
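The rule in the previous paragraph can be sketched roughly as follows. The function names, the 60 second default, and the downtime-fraction estimate are illustrative assumptions, not existing server list code.

```python
def should_record_restart(last_player_count, downtime_seconds,
                          max_quiet_downtime=60):
    """Decide whether a restart should count against a server.

    A restart is ignored only when both conditions hold: nobody was
    online at the last sample, and the outage was shorter than the
    limit (assumed here to be 60 seconds).
    """
    harmless = last_player_count == 0 and downtime_seconds < max_quiet_downtime
    return not harmless


def unavailable_probability(total_downtime_seconds, window_seconds):
    """Rough chance that a join attempt inside the window hits an outage."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return min(1.0, max(0.0, total_downtime_seconds / window_seconds))
```

With this shape, an empty server that restarts within a few seconds keeps its score, while a restart that kicks players or leaves the server down for a long time is still recorded.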

To really estimate server instability, it might make the most sense to try to capture intentional vs. unexpected shutdowns, but that would require expanding the communication protocols, e.g. having the server have to positively signal intent to shutdown/restart to avoid being marked as a crash.

Warr1024, Aug 09 '21 15:08

> To really estimate server instability, it might make the most sense to try to capture intentional vs. unexpected shutdowns, but that would require expanding the communication protocols, e.g. having the server have to positively signal intent to shutdown/restart to avoid being marked as a crash.

This already exists: the server sends a request with "action": "delete" on clean shutdown.
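A minimal sketch of how the list side might use that signal, assuming the last announce received from a server is available as a parsed dict. `classify_shutdown` is a hypothetical helper; only the `"action": "delete"` value comes from the announce described above.

```python
def classify_shutdown(last_announce):
    """Label a server that dropped off the list as 'clean' or 'unexpected'.

    last_announce is the most recent announce payload seen from the
    server, as a dict, or None if nothing was ever received.
    """
    if last_announce is not None and last_announce.get("action") == "delete":
        return "clean"
    # Any other last action means the server vanished without saying goodbye.
    return "unexpected"
```

Only shutdowns labelled `"unexpected"` would then need to count toward an instability penalty.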

ShadowNinja, Dec 03 '21 04:12