Runtime error in util.lua:burn()

Open thomasrudin opened this issue 7 years ago • 12 comments

Stacktrace:

minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: ServerError: AsyncErr: ServerThread::run Lua: Runtime error from mod 'sauth' in callback node_on_receive_fields(): /data/world//worldmods/digtron/util.lua:175: bad argument #1 to 'pairs' (table expected, got nil)
minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: stack traceback:
minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: 	[C]: in function 'pairs'
minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: 	/data/world//worldmods/digtron/util.lua:175: in function 'burn'
minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: 	/data/world//worldmods/digtron/util_execute_cycle.lua:234: in function 'execute_dig_cycle'
minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: 	/data/world//worldmods/digtron/nodes/node_controllers.lua:146: in function 'auto_cycle'
minetest_1                 | 2018-10-07 04:06:46: ERROR[Main]: 	/data/world//worldmods/digtron/nodes/node_controllers.lua:260: in function </data/world//worldmods/digtron/nodes/node_controllers.lua:234>

This happened twice in a row on my server, so it should be reproducible somehow.

thomasrudin avatar Oct 07 '18 07:10 thomasrudin

That means there was no inventory at location.pos or no fuel list in it. Sadly, this sounds too much like #5 and its successors #16, #17, and #22. These may be Digtron bugs, or they may expose some core bugs, like race conditions.
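A defensive version of that loop might look like the sketch below. It is not Digtron's actual `burn` implementation: `burn_guarded`, the `get_inventory` parameter, and the stub inventory are hypothetical names, used so the snippet runs outside Minetest. In a real mod the lookup would presumably be `minetest.get_inventory({type = "node", pos = location.pos})`, and the sketch assumes that both it and `get_list` can return nil.

```lua
-- Hypothetical sketch, not Digtron's real code: skip fuel stores whose
-- inventory or "fuel" list is missing instead of passing nil to pairs().
local function burn_guarded(fuelstore_positions, get_inventory)
	local visited, skipped = 0, 0
	for _, location in pairs(fuelstore_positions or {}) do
		local inv = get_inventory({type = "node", pos = location.pos})
		local fuel = inv and inv:get_list("fuel")
		if fuel then
			for _, itemstack in pairs(fuel) do
				visited = visited + 1 -- real code would burn the item here
			end
		else
			skipped = skipped + 1 -- no inventory, or no "fuel" list in it
		end
	end
	return visited, skipped
end

-- Stub standing in for minetest.get_inventory (assumption for the demo).
local function stub_get_inventory(location)
	if location.pos.x == 0 then
		return {
			get_list = function(self, name)
				if name == "fuel" then
					return {"default:coal_lump", "default:coal_lump"}
				end
				return nil
			end,
		}
	end
	return nil -- simulates a position with no inventory
end

local visited, skipped = burn_guarded(
	{{pos = {x = 0, y = 0, z = 0}}, {pos = {x = 1, y = 0, z = 0}}},
	stub_get_inventory
)
print(visited, skipped)
```

A guard like this only hides the symptom. It would stop the crash at util.lua:175 but would not explain why the inventory went missing.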

numberZero avatar Oct 12 '18 19:10 numberZero

Actually, this is a dupe of #16 and #22.

numberZero avatar Oct 12 '18 19:10 numberZero

I really need to collapse a bunch of these dupes together into one mega-issue.

One possible approach that comes to mind is to try just generally optimizing my code as much as possible. Digtron was actually my very first mod, and although I've continued tweaking and fiddling with it over the years, it's still got a lot of creaky and suboptimal stuff in it. While working on a rather resource-intensive mapgen mod recently, I learned that Minetest's Lua environment does not cope gracefully with memory and garbage collection issues; it resulted in a bunch of "impossible" things happening. So maybe that's contributing to the underlying problem here.

FaceDeer avatar Jan 02 '19 07:01 FaceDeer

Can confirm this error still happens and is quite annoying. Might be worth looking into a different method of layout/meta handling.

GreenXenith avatar Jan 02 '19 20:01 GreenXenith

Anecdotal reference: it happens more frequently with mobile users on my server...

thomasrudin avatar Jan 02 '19 20:01 thomasrudin

Okay, this is definitely looking familiar now. I struggled with a similar bug in another mod just a few days ago. It turned out that I was allocating too many new tables too rapidly and Lua started silently failing to do so, resulting in "impossible" nil values where tables should be.

Maybe try out the optimization branch I'm currently working in, I've already cleaned up a lot of excess table burden. I haven't tested it extensively yet though (hence why it's still in a branch) so caveat emptor.

FaceDeer avatar Jan 03 '19 03:01 FaceDeer

Lua started silently failing

That's definitely a bug; it must report OOM. Have you tested on classical Lua or LuaJIT? Either way, report it as an MT issue; I suspect it's MT that silences some errors.

numberZero avatar Jan 03 '19 14:01 numberZero

I've been doing my dev work in minetest-5.0.0-0e306c0-win64 from sfan5's builds, not sure which Lua it uses.
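One way to find out which interpreter a build uses is to check for the `jit` global that LuaJIT provides. The sketch below is a minimal check in plain Lua. `describe_lua_runtime` is a hypothetical helper name, and whether `jit` is visible from inside a mod's environment is an assumption that may depend on the engine's sandbox settings.

```lua
-- Hypothetical helper: report whether the code is running under LuaJIT
-- or a stock (PUC-Rio) Lua interpreter.
local function describe_lua_runtime()
	-- LuaJIT exposes a global table `jit` with a `version` string.
	if type(jit) == "table" and type(jit.version) == "string" then
		return "LuaJIT", jit.version
	end
	-- _VERSION is defined by every standard Lua interpreter.
	return "PUC Lua", _VERSION
end

local kind, version = describe_lua_runtime()
print(kind, version)
```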

A possibly related tidbit that may explain why standard stress tests aren't seeing this: I only started encountering the similar bug I mentioned above when I was running Minetest on a computer that was starved for physical memory because of some other big processes. I wonder if perhaps Lua only pays attention to how much of its own memory sandbox it is using and is unaware that the sandbox itself is being squeezed. I'll mention that when I report it on the MT issue tracker.

FaceDeer avatar Jan 04 '19 02:01 FaceDeer

Minetest currently uses LuaJIT, which caps memory usage, in case that has anything to do with it. My server runs on 4GB of RAM, whereas I personally run 24GB (which is consequently useless in Minetest because of LuaJIT).

GreenXenith avatar Jan 04 '19 02:01 GreenXenith

Hmm, it may be that MT itself runs out of memory and passes nil values to Lua... It would be very interesting to take a snapshot.

EDIT: I just remembered that the MS® version of new may not throw bad_alloc on OOM (as it should) but return NULL instead (which it must never do). That might cause such symptoms, as the nils you get would be returned by C++ code, not by Lua.

numberZero avatar Jan 05 '19 17:01 numberZero

Well isn't this a dilly of a pickle, then. If this is the true problem then no wonder none of my attempts to reproduce these bugs or guard against them has worked. Every time I would have sat down with a fresh clean new world with only Digtron installed, and since I'd be focused on the task at hand I wouldn't have any other big background tasks running on my computer to sap its memory.

So in a nutshell, the Lua code

local new_table = {"something"}

may sometimes, depending on what else is running on your computer at the time, assign nil to new_table instead of {"something"}. There are some things I can do to guard against this problem on the Lua side of things. I've already done a pass to greatly reduce the inefficient allocation of new tables, and I can probably do more. I bet there are going to be some builtin functions that allocate tables that I just can't avoid, though.
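One common way to cut table churn is to reuse a scratch table instead of allocating a fresh one on every call. The sketch below illustrates the idea only: `collect_offsets` and `scratch` are hypothetical names, not Digtron code, and nothing here demonstrates or fixes the suspected nil assignment.

```lua
-- Hypothetical sketch: reuse one scratch table across calls rather than
-- allocating a new table every cycle.
local scratch = {}

local function collect_offsets(origin, count, out)
	out = out or {}
	-- Clear leftovers from a previous call.
	for i = #out, 1, -1 do
		out[i] = nil
	end
	for i = 1, count do
		-- Store plain numbers rather than small per-entry tables,
		-- so the loop itself allocates no new tables.
		out[i] = origin + i
	end
	return out
end

local first = collect_offsets(10, 3, scratch)
local second = collect_offsets(20, 2, scratch)
print(first == second, #second, second[1])
```

Both calls return the same table, so callers must copy anything they want to keep past the next call. In standard Lua, `collectgarbage("count")` reports the heap size in kilobytes and can be used to compare the two approaches.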

I'm at work right now but I'll write this up for the MT issue tracker when I get home tonight.

FaceDeer avatar Jan 06 '19 18:01 FaceDeer

I was finally able to reproduce this on my local machine, and I think I have a fix - at least for the repro case I was getting. Digtron was sometimes reading incomplete layout information into memory when adjacent to unloaded blocks. Once I guarded against that in a different manner than I had been doing previously, it stopped reproducing for me, so hopefully that was a big part of this.
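A guard of this kind could be sketched as follows. This is not the fix that actually went into Digtron: `layout_is_complete` and the stubbed `get_node` are hypothetical. The sketch relies on `minetest.get_node` returning a node named "ignore" for positions in unloaded map blocks.

```lua
-- Hypothetical sketch: refuse to build a layout if any neighbour of a
-- layout node sits in an unloaded block (reported as "ignore").
local NEIGHBOURS = {
	{x = 1, y = 0, z = 0}, {x = -1, y = 0, z = 0},
	{x = 0, y = 1, z = 0}, {x = 0, y = -1, z = 0},
	{x = 0, y = 0, z = 1}, {x = 0, y = 0, z = -1},
}

local function layout_is_complete(positions, get_node)
	for _, pos in ipairs(positions) do
		for _, d in ipairs(NEIGHBOURS) do
			local node = get_node({
				x = pos.x + d.x, y = pos.y + d.y, z = pos.z + d.z,
			})
			if node == nil or node.name == "ignore" then
				return false -- caller should abort the cycle and retry later
			end
		end
	end
	return true
end

-- Stub for minetest.get_node: everything at x >= 16 counts as unloaded.
local function stub_get_node(pos)
	if pos.x >= 16 then
		return {name = "ignore"}
	end
	return {name = "air"}
end

local inside = layout_is_complete({{x = 5, y = 0, z = 0}}, stub_get_node)
local at_edge = layout_is_complete({{x = 15, y = 0, z = 0}}, stub_get_node)
print(inside, at_edge)
```

In a mod, `minetest.get_node` would be passed in place of the stub, and a false result would mean skipping the dig cycle rather than acting on a partial layout.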

FaceDeer avatar Jan 13 '19 23:01 FaceDeer