Performance issue while iterating over a table of tables
Hi, I was trying to benchmark mlua to see what gain I could achieve by moving some of my program's logic into Rust, so I wrote this test:
use mlua::prelude::*;

fn sum(_lua: &Lua, t: LuaTable) -> LuaResult<f64> {
    let mut ret: f64 = 0.;
    for t in t.sequence_values::<LuaTable>() {
        let t = t?;
        let v: f64 = t.get("count")?;
        ret += v;
    }
    Ok(ret)
}

#[mlua::lua_module]
fn tmod(lua: &Lua) -> LuaResult<LuaTable> {
    let exports = lua.create_table()?;
    exports.set("sum", lua.create_function(sum)?)?;
    Ok(exports)
}
Then I wrote this Lua code to benchmark it:
package.path = package.path .. ";.\\target\\release\\?.lua;"
package.cpath = package.cpath .. ";.\\target\\release\\?.dll;"

function sum_lua(l)
    local ret = 0
    for k, v in pairs(l) do
        ret = ret + v["count"]
    end
    return ret
end

function gen_data()
    local ret = {}
    for i = 1, 10000 do
        ret[#ret + 1] = { count = i }
    end
    return ret
end

function mar(func)
    local totalRuntime = 0
    local ret = 0
    local testTimes = 100
    for i = 1, testTimes do
        local startTime = os.clock()
        ret = ret + func()
        local endTime = os.clock()
        totalRuntime = totalRuntime + (endTime - startTime)
    end
    print(ret)
    local averageRuntime = totalRuntime / testTimes
    print("Average runtime over " .. testTimes .. " runs: " .. averageRuntime .. " seconds")
    return averageRuntime
end

local data = gen_data()
tmod = require("tmod")
testsum = tmod.sum

print(sum_lua(data))
print(testsum(data))

local b = mar(function()
    return testsum(data)
end)
local a = mar(function()
    return sum_lua(data)
end)

print(b / a)
print(b / a)
And I got this result:
50005000
50005000
5000500000
Average runtime over 100 runs: 0.00559 seconds
5000500000
Average runtime over 100 runs: 0.00038 seconds
14.710526315789
This means the Rust code is doing the same task about 14 times slower than the Lua code. I wonder what caused that; I'm not really familiar with Lua internals, so I can't say whether it's a bug or not.
Target Lua version: Lua 5.1.5 Copyright (C) 1994-2012 Lua.org, PUC-Rio
Rust version: rustc 1.72.1 (d5c2e9c34 2023-09-13)
In module mode there is some overhead related to safety checks (e.g. verifying that memory allocations succeeded, and so on).
You can significantly improve performance by compiling the module with the skip_memory_check option:
#[mlua::lua_module(skip_memory_check)]
But I'm not surprised that the Lua VM is faster in this particular case, because mlua goes through the public Lua API, while the Lua VM itself uses much faster internal functions that bypass some of that overhead.
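As a side note, part of the per-access overhead can sometimes be shaved off by using mlua's raw table accessors, which skip metamethod lookups. This is only a sketch of the idea, assuming the mlua 0.9 `Table` API (`raw_len`/`raw_get`); it hasn't been benchmarked here:

```rust
use mlua::prelude::*;

// Hypothetical variant of `sum` using raw accessors. `raw_len` and
// `raw_get` bypass the metamethod-aware paths (__index, __len), which
// the plain `get` has to consider on every lookup.
fn sum_raw(_lua: &Lua, t: LuaTable) -> LuaResult<f64> {
    let mut ret = 0.0f64;
    for i in 1..=t.raw_len() {
        // Fetch the i-th element of the sequence without metamethods.
        let entry: LuaTable = t.raw_get(i)?;
        ret += entry.raw_get::<_, f64>("count")?;
    }
    Ok(ret)
}
```

Whether this helps measurably depends on the workload; the dominant cost may still be crossing the C API boundary once per element.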
Thank you for your response. Is it possible to add a memory pool to reuse allocations for mlua? Do you believe that we have any other solution? Is it possible to achieve the serialization performance of, for example, CJSON with mlua using only safe code?
Is it possible to add a memory pool to reuse allocations for mlua?
I'll look into it. But not in module mode; in that case mlua does not control the Lua VM's allocations.
Is it possible to achieve the serialization performance of, for example, CJSON with mlua using only safe code?
I hope so. Optimizing serialization performance hasn't been my focus yet, but I'll look into it now. There are definitely some options that can (significantly) improve table serialization performance.
By default, when mlua iterates over table keys (the hash part), it protects every call to lua_next, since that call can raise an error if the table is modified during traversal. It seems this can be simplified for serialization, where the table contents don't change between iterations.