OTS Statistics by kondra ([email protected])
I've got permission from kondra ([email protected]) to release this code. He only requested to leave information about author kondra ([email protected]) in commit message.
It's well known 'OTS stats' system. It tracks all dispatcher thread tasks, Lua function calls and SQL queries.
It generates files dispatcher.log, lua.log and sql.log in data/logs/stats that looks like:
[29/01/2023 16:08:37]
Thread: 1 Cpu usage: 0.628848% Idle: 99.6671% Other: -0.295933% Players online: 0
Time (ms) Calls Rel usage % Real usage % Description
167 120 88.80196% 0.55843% std::bind(&Game::checkDecay, this)
20 300 10.83945% 0.06816% std::bind(&Game::checkCreatures, this, (index + 1) % EVENT_CREATURECOUNT)
0 12 0.30616% 0.00193% std::bind(&Game::updateWorldTime, this)
Report interval is configurable in config.lua.
It also reports slow/very slow function calls in files dispatcher_slow.log, lua_slow.log and sql_slow.log. Ex. server start takes over 50 ms, it's reported as very slow dispatcher_very_slow.log:
[29/01/2023 16:06:07] Execution time: 466 ms - std::bind(mainLoader, argc, argv, &serviceManager) - main
Lua startup.lua executes over 10 ms, it's reported as slow:
[29/01/2023 16:06:07] Execution time: 35 ms - data/globalevents/scripts/startup.lua:onStartup -
SQL took over 10 ms, it's reported as slow:
[29/01/2023 16:06:07] Execution time: 24 ms - TRUNCATE TABLE `towns` - TRUNCATE TABLE `towns`
Slow/very slow time is configurable in config.lua
There are also special.log, special_slow.log and special_very_slow.log, which report your custom statistics. Ex. to report decay time of items, in game.cpp under:
void Game::internalDecayItem(Item* item)
{
add:
AutoStat stat("internalDecayItem", std::to_string(item->getID()));
String internalDecayItem will be used for total CPU usage report. Second parameter (item ID) will be reported in slow and very_slow logs. In this case, it would say which item ID decayed over 10 ms (single item decay time, not all items with this item ID).
Most OTSes pass player name as second parameter. It let them track which player generated 'slow' actions - lagged OTS.
It's MR draft
There are few things to test:
- how
addEventLua functions are reported (is itUnknownor prints whole function Lua code), when you pass anonymous function ex.:
addEvent(function(x) print(x) end, 100)
- how dispatcher tasks using lambda functions are reported (is it
Unknownor prints whole function C++ code) ex.:
g_scheduler.addEvent(createSchedulerTask(OUTPUTMESSAGE_AUTOSEND_DELAY.count(), [&]() { sendAll(bufferedProtocols); }));
- does it compile on Windows
Feature implementation could also be better separated using STATS_ENABLED definition. Right now, when you compile it without STATS_ENABLED, it still pass function name for every dispatcher task, but it passes empty string:
https://github.com/gesior/forgottenserver-gesior/blob/0ad4be6cab3e335e6f7dc767722f70ee0c849be8/src/stats.h#L17-L23
This is promising. I am looking into it, will probably require some rewriting to fit the standards of the rest of the code though. Do you mind if I push to your branch?
For lambda functions I create this macro:
#define CREATE_TASK_LAMBDA(lambda) createNewTask(lambda, #lambda, __FUNCTION__)
#define createTask(...) CREATE_TASK_LAMBDA(__VA_ARGS__)
#define CREATE_TIMED_TASK_LAMBDA(delay, lambda) createNewTask(delay, lambda, #lambda, __FUNCTION__)
#define createTimedTask(delay, ...) CREATE_TIMED_TASK_LAMBDA(delay, __VA_ARGS__)
#define CREATE_SCHEDULER_TASK_LAMBDA(delay, lambda) createNewSchedulerTask(delay, lambda, #lambda, __FUNCTION__)
#define createSchedulerTask(delay, ...) CREATE_SCHEDULER_TASK_LAMBDA(delay, __VA_ARGS__)
#define ADD_GAME_TASK_LAMBDA(lambda) addNewGameTask(lambda, #lambda, __FUNCTION__)
#define addGameTask(...) ADD_GAME_TASK_LAMBDA(__VA_ARGS__)
#define ADD_GAME_TASK_TIMED_LAMBDA(delay, lambda) addNewGameTaskTimed(delay, lambda, #lambda, __FUNCTION__)
#define addGameTaskTimed(delay, ...) ADD_GAME_TASK_TIMED_LAMBDA(delay, __VA_ARGS__)
For lambda functions I create this macro:
#define CREATE_TASK_LAMBDA(lambda) createNewTask(lambda, #lambda, __FUNCTION__) #define createTask(...) CREATE_TASK_LAMBDA(__VA_ARGS__) #define CREATE_TIMED_TASK_LAMBDA(delay, lambda) createNewTask(delay, lambda, #lambda, __FUNCTION__) #define createTimedTask(delay, ...) CREATE_TIMED_TASK_LAMBDA(delay, __VA_ARGS__) #define CREATE_SCHEDULER_TASK_LAMBDA(delay, lambda) createNewSchedulerTask(delay, lambda, #lambda, __FUNCTION__) #define createSchedulerTask(delay, ...) CREATE_SCHEDULER_TASK_LAMBDA(delay, __VA_ARGS__) #define ADD_GAME_TASK_LAMBDA(lambda) addNewGameTask(lambda, #lambda, __FUNCTION__) #define addGameTask(...) ADD_GAME_TASK_LAMBDA(__VA_ARGS__) #define ADD_GAME_TASK_TIMED_LAMBDA(delay, lambda) addNewGameTaskTimed(delay, lambda, #lambda, __FUNCTION__) #define addGameTaskTimed(delay, ...) ADD_GAME_TASK_TIMED_LAMBDA(delay, __VA_ARGS__)
Did you test it on Windows and Linux?
I've seen some 'lambda functions' solution, but it only worked on Windows (VS), on Linux it did not compile at all.
As I remember, it was adding ( and ) to every lambda call. Maybe yours is real lambda functions solution.
How are they reported in dispatcher.log? As full functions code?
This is promising. I am looking into it, will probably require some rewriting to fit the standards of the rest of the code though. Do you mind if I push to your branch?
I made TFS 1.4 official commit/PR, as it was only version I had on release time. I'm very busy at work :( I've got permission - from Kondra on OTLand - to release it since almost 3 months (Nov 12, 2022). I've posted TFS 0.4 'OTS stats' on OTLand - .zip attached to first post.
Push it to YOUR branch. Just remember to mention kondra ([email protected]) to keep it in-line with kondra's 'free' release rules.
EDIT: I've got confirmation that with @conde2 changes it reports pretty fine 'lambda' functions of TFS 1.5:
[06/02/2023 21:18:56]
Thread: 1 Cpu usage: 2.20086% Idle: 81.6726% Other: 16.1266% Players online: 1
Time (ms) Calls Rel usage % Real usage % Description
4831 1 73.17477% 1.61047% ([=, services = &serviceManager]() { mainLoader(argc, argv, services); })
724 13 10.97336% 0.24151% ([=, playerID = player->getID(), receiver = std::move(receiver), text = std::move(text)]() { g_game.playerSay(playerID, channelId, type, receiver, text); })
302 2253 4.58470% 0.10090% [=]() { checkCreatures((index + 1) % EVENT_CREATURECOUNT); }
260 9313 3.93936% 0.08670% [=]() { g_luaEnvironment.executeTimerEvent(lastTimerEventId); }
191 14866 2.89354% 0.06368% [&]() { sendAll(bufferedProtocols); }
101 3957 1.53302% 0.03374% [this]() { checkDecay(); }
54 2033 0.83108% 0.01829% [id = getID()]() { g_game.checkCreatureWalk(id); }
45 5 0.68775% 0.01514% [thisPtr = getThis()]() { thisPtr->logout(true, false); }
35 6 0.53411% 0.01176% ([=, thisPtr = getThis(), characterName = std::move(characterName)]() { thisPtr->login(characterName, accountId, operatingSystem); })
28 1 0.43887% 0.00966% sigbreakHandler
Did not test it myself, but someone sent me report like that on Discord.
EDIT 2
Change above 'for lambda' requires changes in all lambda functions! Adding ( and ) to each call like, replace:
g_scheduler.addEvent(createSchedulerTask(walkDelay, [=, id = getID()]() { g_game.forceRemoveCondition(id, type); }));
with:
g_scheduler.addEvent(createSchedulerTask(walkDelay, ([=, id = getID()]() { g_game.forceRemoveCondition(id, type); })));
Tested on Windows VS code and it reports all the complete lambda function
Please add me the credits as well.
Shouldn't the addLuaStats be in protectedCall instead of callFunction?