audioread

Gstreamer backend seems to leak memory

Open albertz opened this issue 7 years ago • 9 comments

Here is a small script to demonstrate the issue. The memory consumption constantly grows (up to 8GB). See here for a discussion.

albertz avatar Feb 22 '18 15:02 albertz

Interesting! Can you please do some more investigation to narrow the leak down to specific actions in the audioread library? We might have a shot at fixing this if you can point to exactly what's being leaked.

sampsyo avatar Feb 22 '18 15:02 sampsyo

See the script. Actually, the only thing I use is audio_open, looped over a lot of FLAC files. I call it only indirectly via librosa.load(filename, sr=None), which is a very straightforward usage of audio_open.

albertz avatar Feb 22 '18 15:02 albertz

I understand, but that still doesn't point to exactly where the leak is coming from. It would be awesome to have your help investigating exactly what gets leaked and when.

sampsyo avatar Feb 22 '18 15:02 sampsyo
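
For anyone picking up the investigation, here is a minimal sketch of one way to check whether memory grows while decoding many files in a loop. It is not the script from the original report. `measure_growth`, `peak_rss_kib`, and `decode_with_audioread` are hypothetical helper names, and the sketch assumes a Unix system, since it relies on the standard `resource` module. It samples the process's peak resident set size, which also covers native allocations that Python-level tools such as `tracemalloc` would not see.

```python
import resource


def peak_rss_kib():
    # Peak resident set size of this process.
    # Linux reports ru_maxrss in KiB; macOS reports it in bytes.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def decode_with_audioread(path):
    # Lazy import so the sketch can be loaded without audioread installed.
    import audioread

    # Open the file and drain every decoded buffer, then close it.
    with audioread.audio_open(path) as f:
        for _ in f:
            pass


def measure_growth(decode, paths, every=100):
    """Call decode(path) for each path; sample peak RSS every `every` files."""
    samples = []
    for i, path in enumerate(paths, 1):
        decode(path)
        if i % every == 0:
            samples.append((i, peak_rss_kib()))
    return samples
```

Calling `measure_growth(decode_with_audioread, flac_paths)` on a large list of files and comparing the first and last samples would show whether the peak keeps climbing. Swapping in a different `decode` function (for example, one based on another library) gives a baseline for comparison.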

Yes, that would be nice, but I'm not sure I have the time now (I already spent multiple hours debugging this issue and need to get back to my actual work). I think you should be able to reproduce the issue with my script. As there are so many issues with Gstreamer anyway, I would even suggest removing it completely. My solution for now is to use PySoundFile instead of audioread. Btw., that is also what librosa recommends.

albertz avatar Feb 22 '18 15:02 albertz

OK! Please check back in if you ever get the chance to help.

sampsyo avatar Feb 22 '18 15:02 sampsyo

I'm experiencing this issue with Beets 1.4.6 on Fedora 28.

I tried updating to Git master of audioread, as the unreleased version 2.1.7 contains an FD leak fix (https://github.com/beetbox/audioread/commit/72ed349c12a16ab741cb02abc4de8f2e8e7fe4ee). This change causes beet import to either segfault or to log the following traceback:

Exception in thread Thread-6:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.6/site-packages/audioread/ffdec.py", line 69, in run
    data = self.fh.read(self.blocksize)
ValueError: I/O operation on closed file

Exception in thread Thread-7:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.6/site-packages/audioread/ffdec.py", line 69, in run
    data = self.fh.read(self.blocksize)
ValueError: PyMemoryView_FromBuffer(): info->buf must not be NULL

I haven't been able to reproduce this issue using audioread/decode.py.

ssssam avatar Dec 25 '18 22:12 ssssam

That's troubling. @RyanMarcus, have you encountered this?

Perhaps, to reproduce the problem, one would need to decode several files in a row?

sampsyo avatar Dec 26 '18 15:12 sampsyo

I've managed to reproduce it now. The crash appears to be triggered if the .close() method is called before reading is complete. I'll open a separate MR with a fix (edit: https://github.com/beetbox/audioread/pull/78)

ssssam avatar Dec 26 '18 17:12 ssssam

Huh, that's strange -- it looks like a race. When the process is started, it seems like the reading process is delegated to a thread (i.e. QueueReaderThread). When close is called (possibly via __del__), my change closes the FDs, but potentially leaves the reader thread running.

I haven't tested this, but it would explain why a partial read is causing the issue.

RyanMarcus avatar Dec 26 '18 17:12 RyanMarcus
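
The race described in the last comments can be illustrated with a small, self-contained sketch. This is not audioread's code and not necessarily what the linked pull request does. `ReaderThread` and `close_safely` are hypothetical stand-ins, and an `os.pipe()` takes the place of the decoder subprocess's output. The idea shown is only the ordering: make the reader thread see end-of-file and wait for it to exit before closing the file object it reads from, so that `read()` is never called on a closed file.

```python
import os
import queue
import threading


class ReaderThread(threading.Thread):
    """Hypothetical stand-in for a thread that reads blocks into a queue."""

    def __init__(self, fh, blocksize=1024):
        super().__init__(daemon=True)
        self.fh = fh
        self.blocksize = blocksize
        self.queue = queue.Queue()
        self.error = None

    def run(self):
        try:
            while True:
                data = self.fh.read(self.blocksize)
                self.queue.put(data)
                if not data:
                    break  # End of stream.
        except ValueError as exc:
            # Raised if the file object is closed while we are still reading.
            self.error = exc


def close_safely(write_fd, reader, timeout=5):
    # Closing the write end makes the reader hit EOF; in a real decoder this
    # step would correspond to stopping the subprocess (an assumption here).
    os.close(write_fd)
    # Wait for the reader to finish before touching its file object.
    reader.join(timeout=timeout)
    reader.fh.close()
```

Closing `reader.fh` first and joining afterwards (or not joining at all) is the ordering that can leave the thread reading from a closed file, which matches the `ValueError: I/O operation on closed file` traceback above.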