NPC's have a (Lua) memory leak & crash potential
Before creating an issue, please ensure:
- [x] This is a bug in the software that resides in this repository, and not a support matter (use https://otland.net/forums/support.16/ for support)
- [x] This issue is reproducible without changes to the C++ code in this repository
Steps to reproduce (include any configuration/script required to reproduce)
- Stick this code in a talkaction
local tests = {
npc = 1,
monster = 2,
item = 3,
}
local activeTest = tests.npc
local monsterName = "Enslaved Dwarf"
local npcName = "Banker"
local itemId = 2148
local function runTest(playerPos)
for i = 1, 5000 do
if activeTest == tests.npc then
local npc = Game.createNpc(npcName, playerPos)
npc:remove()
elseif activeTest == tests.monster then
local monster = Game.createMonster(monsterName, playerPos)
monster:remove()
elseif activeTest == tests.item then
local item = Game.createItem(itemId, 100, playerPos)
item:remove()
end
end
end
function onSay(player, words, param)
local playerPos = player:getPosition()
local start = os.clock()
runTest(playerPos)
local reloadStart = os.clock()
Game.reload(RELOAD_TYPE_TALKACTIONS)
local testEnd = os.clock()
print("Test #1: Test: "..reloadStart - start.." - Reload: "..testEnd - reloadStart.." - Total: "..testEnd - start)
start = os.clock()
runTest(playerPos)
reloadStart = os.clock()
Game.reload(RELOAD_TYPE_TALKACTIONS)
testEnd = os.clock()
print("Test #2: Test: "..reloadStart - start.." - Reload: "..testEnd - reloadStart.." - Total: "..testEnd - start)
start = os.clock()
runTest(playerPos)
reloadStart = os.clock()
Game.reload(RELOAD_TYPE_TALKACTIONS)
testEnd = os.clock()
print("Test #3: Test: "..reloadStart - start.." - Reload: "..testEnd - reloadStart.." - Total: "..testEnd - start)
playerPos:sendMagicEffect(CONST_ME_BLOCKHIT)
print(("-"):rep(50))
return false
end
- Run the talkaction a few times and keep an eye on the memory usage of the process
Expected behaviour
NPC's cleanly deallocate their memory upon being removed
Actual behaviour
NPC's do not deallocate their memory upon being removed
Observations
- After commenting out this call to
lua_gc, reload times stay consistent, which gives evidence of the Lua environment getting trashed https://github.com/otland/forgottenserver/blob/e1e1bc21a72034cdc57cfbc95701182943307471/src/luascript.cpp#L4604 - Using
local activeTest = tests.monsterorlocal activeTest = tests.itemdoes not produce the same memory leak - It doesn't matter which reload type we use, whether it's
RELOAD_TYPE_TALKACTIONS,RELOAD_TYPE_CREATURESCRIPTSorRELOAD_TYPE_ACTIONSetc, the entire Lua environment is affected
Test Outputs
With the call to lua_gc enabled (increasing timings)
Test #1: Test: 3.841 - Reload: 0.059000000000001 - Total: 3.9
Test #2: Test: 3.913 - Reload: 0.11 - Total: 4.023
Test #3: Test: 3.833 - Reload: 0.167 - Total: 4
--------------------------------------------------
Test #1: Test: 3.832 - Reload: 0.221 - Total: 4.053
Test #2: Test: 3.841 - Reload: 0.274 - Total: 4.115
Test #3: Test: 3.896 - Reload: 0.33 - Total: 4.226
--------------------------------------------------
Test #1: Test: 3.897 - Reload: 0.383 - Total: 4.28
Test #2: Test: 3.916 - Reload: 0.439 - Total: 4.355
Test #3: Test: 3.934 - Reload: 0.492 - Total: 4.426
--------------------------------------------------
Test #1: Test: 3.954 - Reload: 0.547 - Total: 4.501
Test #2: Test: 3.971 - Reload: 0.594 - Total: 4.565
Test #3: Test: 3.988 - Reload: 0.657 - Total: 4.645
--------------------------------------------------
Test #1: Test: 4.006 - Reload: 0.706 - Total: 4.712
Test #2: Test: 3.993 - Reload: 0.76400000000001 - Total: 4.757
Test #3: Test: 4.019 - Reload: 0.815 - Total: 4.834
--------------------------------------------------
Test #1: Test: 4.017 - Reload: 0.87 - Total: 4.887
Test #2: Test: 4.017 - Reload: 0.929 - Total: 4.946
Test #3: Test: 4.061 - Reload: 0.982 - Total: 5.043
--------------------------------------------------
With the call to lua_gc disabled (consistent timings)
Test #1: Test: 3.875 - Reload: 0.0040000000000013 - Total: 3.879
Test #2: Test: 3.921 - Reload: 0.0040000000000013 - Total: 3.925
Test #3: Test: 3.911 - Reload: 0.0040000000000013 - Total: 3.915
--------------------------------------------------
Test #1: Test: 4.046 - Reload: 0.0030000000000001 - Total: 4.049
Test #2: Test: 3.844 - Reload: 0.0040000000000013 - Total: 3.848
Test #3: Test: 3.967 - Reload: 0.0039999999999978 - Total: 3.971
--------------------------------------------------
Test #1: Test: 4.112 - Reload: 0.0049999999999955 - Total: 4.117
Test #2: Test: 4.124 - Reload: 0.0039999999999978 - Total: 4.128
Test #3: Test: 3.94 - Reload: 0.0039999999999978 - Total: 3.944
--------------------------------------------------
Test #1: Test: 3.974 - Reload: 0.0040000000000049 - Total: 3.978
Test #2: Test: 4.007 - Reload: 0.0039999999999978 - Total: 4.011
Test #3: Test: 4.217 - Reload: 0.0049999999999955 - Total: 4.222
--------------------------------------------------
Test #1: Test: 4.528 - Reload: 0.0049999999999955 - Total: 4.533
Test #2: Test: 4.186 - Reload: 0.0039999999999907 - Total: 4.19
Test #3: Test: 4.046 - Reload: 0.0030000000000001 - Total: 4.049
--------------------------------------------------
Test #1: Test: 4.108 - Reload: 0.0039999999999907 - Total: 4.112
Test #2: Test: 4.33 - Reload: 0.0040000000000049 - Total: 4.334
[Warning - NpcScript::NpcScript] Can not load script: bank.lua
not enough memory
Memory Graphs
NPC Test

Monster Test

Item Test

Impact on the server
Eventually, the Lua environment will run out of available memory to allocate and you will get the following error, after which the server will shortly crash
[Warning - NpcScript::NpcScript] Can not load script: bank.lua
not enough memory
Probably the npc system is at fault (I think it's known from having memory leaks), if you empty the lua files to bare minimum is it still leaking? Or if you use another npc.
For all of these tests I was using 33333 iterations per test, to get to a total of 100k spawns
local function runTest(playerPos)
for i = 1, 33333 do
if activeTest == tests.npc then
local npc = Game.createNpc(npcName, playerPos)
npc:remove()
elseif activeTest == tests.monster then
local monster = Game.createMonster(monsterName, playerPos)
monster:remove()
elseif activeTest == tests.item then
local item = Game.createItem(itemId, 100, playerPos)
item:remove()
end
end
end
Results
The first dip is always after the talkaction finished executing, the 2nd dip is always after manually running /reload npc after execution finishes
Test 1 (Removing the script="bank.lua" tag from the Banker.xml file entirely
Looks like we got perfect memory deallocation in this case

Test 2 (Bare minimum NPC script file)
It's seemingly less with an "empty" NPC script file, but I can still observe a leak coming from data/npc/lib/npc.lua (more specifically, data/npc/lib/npcsystem.lua)

Test 3 (Adding KeywordHandler)

Test 4 (Adding NpcHandler)

Test 5 (Parsing parameters & enabling NpcHandler callbacks)

Test 6 (Adding FocusModule)

Test 7 (Adding a few keywords to listen for)

There's some exception with code 0xE24C4A02 spamming the log when running a debugger
it's being launched from this line (and has npc onThink in stacktrace):
https://github.com/otland/forgottenserver/blob/0e20d54eebc6be8ea7ed50d0fd3d31d5d43df1f9/src/luascript.cpp#L300
could someone confirm this with unmodded tfs?
https://github.com/otland/forgottenserver/pull/4483