luv icon indicating copy to clipboard operation
luv copied to clipboard

memory leak related to `uv.send_async`

Open FelipeLema opened this issue 4 years ago • 6 comments

I made an ad-hoc by-line process output reader and it seems that I found a memory leak.

I wrapped into a min working example below. This is supposed to uv.async_send the lines of the output of a certain process. I setup a process that prints data endlessly. The output is stored in current_chunk, but it's only temporary: each time data arrives, I look at the chunk and extract the lines, except for the last one (in case the last line did not arrive completely).

If you keep the uv.send_async line below, memory consumption of the lua process will grow endlessly. If you remove it, the memory consumption remains still. In both cases, each line is discarded.

local uv = require("luv") -- "luv" when stand-alone, "uv" in luvi apps-- WIP luv (related to luvit)

local stdout = uv.new_pipe()
local handle, pid =
   uv.spawn(
      "yes",
      {stdio = {nil, stdout, nil},
       args = {}
      },
      function(code,signal)
         -- ignore
      end)

local current_chunk  = "" -- accumulated data until a \n
local no_more_output = false
-- process each line
cb = uv.new_async(function(line)
      -- discard line
      if no_more_output then
         cb:close()
      end
end)

uv.read_start(stdout, function(err, data)
                 assert(not err, err)
                 if data then
                    -- append received data into current_chunk
                    current_chunk = string.format("%s%s", current_chunk, data)
                 else
                    no_more_output= true
                 end

                 -- split chunk into lines to send to `callback`
                 lines={}
                 for line in string.gmatch(current_chunk, "[^\n]+") do
                       table.insert(lines, line)
                    end
                    if not no_more_output then
                       -- pop last chunk back into current chunk for next iteration
                       last_item_position = #lines
                       current_chunk = table.remove(lines, last_item_position)
                    end 

                    -- send the lines to the callback
                    for _,l in pairs(lines) do
                       -- this line below creates a memory leak
                       uv.async_send(cb, l)
                    end

                    -- close uv stuff if everything's done
                    if no_more_output then
                       uv.async_send(cb, nil)
                    end

end)
uv.run()

Happens with these lua versions:

% lua -v
Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
% luajit -v
LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2017 Mike Pall. http://luajit.org/

FelipeLema avatar Jul 07 '20 15:07 FelipeLema

As far as I can tell, what's happening here is this, from the Libuv docs:

Warning: libuv will coalesce calls to uv_async_send(), that is, not every call to it will yield an execution of the callback. For example: if uv_async_send() is called 5 times in a row before the callback is called, the callback will only be called once. If uv_async_send() is called again after the callback was called, it will be called again.

Luv currently expects a callback for every call of async_send to be able to free the memory for the data sent. Since not all sends are actually getting a callback called, though, that memory is never being freed.

With your example code, a script that prints the numbers 1-100, and a couple prints added:

read_start cb	"0\n"
lines to be sent	{  }
read_start cb	"1\n2\n3\n4\n5\n6\n"
lines to be sent	{ "01", "2", "3", "4", "5" }
read_start cb	"7\n8\n9\n10\n11\n12\n13\n"
lines to be sent	{ "67", "8", "9", "10", "11", "12" }
async cb	{ "12" }
read_start cb	"14\n15\n16\n17\n18\n19\n20\n"
lines to be sent	{ "1314", "15", "16", "17", "18", "19" }
read_start cb	"21\n22\n23\n24\n25\n"
lines to be sent	{ "2021", "22", "23", "24" }
async cb	{ "24" }
...

That's 2 async_cb's called for around 24 async_send calls.

I'm not totally sure what the solution here would be, though.

squeek502 avatar Oct 22 '20 15:10 squeek502

This is actually both a memory leak and a bug, as we are dropping data here that we shouldn't be. Minimal test case:

  test("multiple sends coalesce", function(print, p, expect, uv)
    local async
    async = uv.new_async(expect(function(data)
      p(data)
      uv.close(async)
    end))
    uv.async_send(async, 'this string will be lost')
    uv.async_send(async, 'this string will be sent')
  end)

This will only print

"this string will be sent"

meaning the "this string will be lost" string is both leaked and is never given to the send callback.

I think we need to treat async data as a FIFO queue that we push onto in luv_async_send and then pop until it's empty in luv_async_cb, calling into Lua with the popped data each time. That is, we need to coalesce the data in the same way that Libuv coalesces calls to uv_async_send.

squeek502 avatar Oct 22 '20 16:10 squeek502

Seems like a good fix, but I wonder if any users are somehow relying on the libuv behavior.

SinisterRectus avatar Oct 24 '20 14:10 SinisterRectus

Libuv doesn't handle any passing of data, so the coalesced callbacks are reasonable there (the only intention of Libuv's async is just to wake up the loop from any thread).

I guess it's possible that some Luv user is calling async_send with the same data multiple times and then expecting only 1 callback call with that data, but it seems fairly unlikely, and, with our current implementation, that will leak the data for every coalesced callback.

It's a little unfortunate that our binding does not match the Libuv code; it'd probably be better if we had split the actual data sending part into a luv-specific function (async_send_data or something).

squeek502 avatar Oct 24 '20 18:10 squeek502

Ah okay. A queue makes sense then.

SinisterRectus avatar Oct 24 '20 19:10 SinisterRectus

I don't mind changing my code at all, although I believe anyone using uv.send_async should be covered by either fixing this coalesced-call-memory-leak within luv code or by the leak being thoroughly documented (perhaps a warning?)

FelipeLema avatar Oct 26 '20 14:10 FelipeLema