forgottenserver icon indicating copy to clipboard operation
forgottenserver copied to clipboard

OTS Statistics by kondra ([email protected])

Open gesior opened this issue 2 years ago • 8 comments

I've got permission from kondra ([email protected]) to release this code. He only requested to leave information about author kondra ([email protected]) in commit message.

It's well known 'OTS stats' system. It tracks all dispatcher thread tasks, Lua function calls and SQL queries. It generates files dispatcher.log, lua.log and sql.log in data/logs/stats that looks like:

[29/01/2023 16:08:37]
Thread: 1 Cpu usage: 0.628848% Idle: 99.6671% Other: -0.295933% Players online: 0
 Time (ms)     Calls     Rel usage %    Real usage % Description
       167       120       88.80196%        0.55843% std::bind(&Game::checkDecay, this)
        20       300       10.83945%        0.06816% std::bind(&Game::checkCreatures, this, (index + 1) % EVENT_CREATURECOUNT)
         0        12        0.30616%        0.00193% std::bind(&Game::updateWorldTime, this)

Report interval is configurable in config.lua.

It also reports slow/very slow function calls in files dispatcher_slow.log, lua_slow.log and sql_slow.log. Ex. server start takes over 50 ms, it's reported as very slow dispatcher_very_slow.log:

[29/01/2023 16:06:07] Execution time: 466 ms - std::bind(mainLoader, argc, argv, &serviceManager) - main

Lua startup.lua executes over 10 ms, it's reported as slow:

[29/01/2023 16:06:07] Execution time: 35 ms - data/globalevents/scripts/startup.lua:onStartup - 

SQL took over 10 ms, it's reported as slow:

[29/01/2023 16:06:07] Execution time: 24 ms - TRUNCATE TABLE `towns` - TRUNCATE TABLE `towns`

Slow/very slow time is configurable in config.lua

There are also special.log, special_slow.log and special_very_slow.log, which report your custom statistics. Ex. to report decay time of items, in game.cpp under:

void Game::internalDecayItem(Item* item)
{

add:

AutoStat stat("internalDecayItem", std::to_string(item->getID()));

String internalDecayItem will be used for total CPU usage report. Second parameter (item ID) will be reported in slow and very_slow logs. In this case, it would say which item ID decayed over 10 ms (single item decay time, not all items with this item ID). Most OTSes pass player name as second parameter. It let them track which player generated 'slow' actions - lagged OTS.

It's MR draft

There are few things to test:

  • how addEvent Lua functions are reported (is it Unknown or prints whole function Lua code), when you pass anonymous function ex.:
addEvent(function(x) print(x) end, 100)
  • how dispatcher tasks using lambda functions are reported (is it Unknown or prints whole function C++ code) ex.:
g_scheduler.addEvent(createSchedulerTask(OUTPUTMESSAGE_AUTOSEND_DELAY.count(), [&]() { sendAll(bufferedProtocols); }));
  • does it compile on Windows

Feature implementation could also be better separated using STATS_ENABLED definition. Right now, when you compile it without STATS_ENABLED, it still pass function name for every dispatcher task, but it passes empty string: https://github.com/gesior/forgottenserver-gesior/blob/0ad4be6cab3e335e6f7dc767722f70ee0c849be8/src/stats.h#L17-L23

gesior avatar Jan 29 '23 15:01 gesior

This is promising. I am looking into it, will probably require some rewriting to fit the standards of the rest of the code though. Do you mind if I push to your branch?

ranisalt avatar Feb 03 '23 21:02 ranisalt

For lambda functions I create this macro:

	#define CREATE_TASK_LAMBDA(lambda) createNewTask(lambda, #lambda, __FUNCTION__)
	#define createTask(...) CREATE_TASK_LAMBDA(__VA_ARGS__)

	#define CREATE_TIMED_TASK_LAMBDA(delay, lambda) createNewTask(delay, lambda, #lambda, __FUNCTION__)
	#define createTimedTask(delay, ...) CREATE_TIMED_TASK_LAMBDA(delay, __VA_ARGS__)

	#define CREATE_SCHEDULER_TASK_LAMBDA(delay, lambda) createNewSchedulerTask(delay, lambda, #lambda, __FUNCTION__)
	#define createSchedulerTask(delay, ...) CREATE_SCHEDULER_TASK_LAMBDA(delay, __VA_ARGS__)

	#define ADD_GAME_TASK_LAMBDA(lambda) addNewGameTask(lambda, #lambda, __FUNCTION__)
	#define addGameTask(...) ADD_GAME_TASK_LAMBDA(__VA_ARGS__)

	#define ADD_GAME_TASK_TIMED_LAMBDA(delay, lambda) addNewGameTaskTimed(delay, lambda, #lambda, __FUNCTION__)
	#define addGameTaskTimed(delay, ...) ADD_GAME_TASK_TIMED_LAMBDA(delay, __VA_ARGS__)

conde2 avatar Feb 06 '23 14:02 conde2

For lambda functions I create this macro:

	#define CREATE_TASK_LAMBDA(lambda) createNewTask(lambda, #lambda, __FUNCTION__)
	#define createTask(...) CREATE_TASK_LAMBDA(__VA_ARGS__)

	#define CREATE_TIMED_TASK_LAMBDA(delay, lambda) createNewTask(delay, lambda, #lambda, __FUNCTION__)
	#define createTimedTask(delay, ...) CREATE_TIMED_TASK_LAMBDA(delay, __VA_ARGS__)

	#define CREATE_SCHEDULER_TASK_LAMBDA(delay, lambda) createNewSchedulerTask(delay, lambda, #lambda, __FUNCTION__)
	#define createSchedulerTask(delay, ...) CREATE_SCHEDULER_TASK_LAMBDA(delay, __VA_ARGS__)

	#define ADD_GAME_TASK_LAMBDA(lambda) addNewGameTask(lambda, #lambda, __FUNCTION__)
	#define addGameTask(...) ADD_GAME_TASK_LAMBDA(__VA_ARGS__)

	#define ADD_GAME_TASK_TIMED_LAMBDA(delay, lambda) addNewGameTaskTimed(delay, lambda, #lambda, __FUNCTION__)
	#define addGameTaskTimed(delay, ...) ADD_GAME_TASK_TIMED_LAMBDA(delay, __VA_ARGS__)

Did you test it on Windows and Linux? I've seen some 'lambda functions' solution, but it only worked on Windows (VS), on Linux it did not compile at all. As I remember, it was adding ( and ) to every lambda call. Maybe yours is real lambda functions solution. How are they reported in dispatcher.log? As full functions code?

This is promising. I am looking into it, will probably require some rewriting to fit the standards of the rest of the code though. Do you mind if I push to your branch?

I made TFS 1.4 official commit/PR, as it was only version I had on release time. I'm very busy at work :( I've got permission - from Kondra on OTLand - to release it since almost 3 months (Nov 12, 2022). I've posted TFS 0.4 'OTS stats' on OTLand - .zip attached to first post.

Push it to YOUR branch. Just remember to mention kondra ([email protected]) to keep it in-line with kondra's 'free' release rules.

EDIT: I've got confirmation that with @conde2 changes it reports pretty fine 'lambda' functions of TFS 1.5:

[06/02/2023 21:18:56]
Thread: 1 Cpu usage: 2.20086% Idle: 81.6726% Other: 16.1266% Players online: 1
 Time (ms)     Calls     Rel usage %    Real usage % Description
      4831         1       73.17477%        1.61047% ([=, services = &serviceManager]() { mainLoader(argc, argv, services); })
       724        13       10.97336%        0.24151% ([=, playerID = player->getID(), receiver = std::move(receiver), text = std::move(text)]() { g_game.playerSay(playerID, channelId, type, receiver, text); })
       302      2253        4.58470%        0.10090% [=]() { checkCreatures((index + 1) % EVENT_CREATURECOUNT); }
       260      9313        3.93936%        0.08670% [=]() { g_luaEnvironment.executeTimerEvent(lastTimerEventId); }
       191     14866        2.89354%        0.06368% [&]() { sendAll(bufferedProtocols); }
       101      3957        1.53302%        0.03374% [this]() { checkDecay(); }
        54      2033        0.83108%        0.01829% [id = getID()]() { g_game.checkCreatureWalk(id); }
        45         5        0.68775%        0.01514% [thisPtr = getThis()]() { thisPtr->logout(true, false); }
        35         6        0.53411%        0.01176% ([=, thisPtr = getThis(), characterName = std::move(characterName)]() { thisPtr->login(characterName, accountId, operatingSystem); })
        28         1        0.43887%        0.00966% sigbreakHandler

Did not test it myself, but someone sent me report like that on Discord. EDIT 2 Change above 'for lambda' requires changes in all lambda functions! Adding ( and ) to each call like, replace:

g_scheduler.addEvent(createSchedulerTask(walkDelay, [=, id = getID()]() { g_game.forceRemoveCondition(id, type); }));

with:

g_scheduler.addEvent(createSchedulerTask(walkDelay, ([=, id = getID()]() { g_game.forceRemoveCondition(id, type); })));

gesior avatar Feb 06 '23 20:02 gesior

Tested on Windows VS code and it reports all the complete lambda function

Please add me the credits as well.

conde2 avatar Feb 06 '23 21:02 conde2

Shouldn't the addLuaStats be in protectedCall instead of callFunction?

conde2 avatar Feb 07 '23 01:02 conde2