audioread
stream not properly closed (NoBackendError | OSError: Too many open files)
import audioread

filename = 'some_audio_file.ogg'
try:
    for K in range(2048):
        with audioread.audio_open(filename) as audio:
            print(K, audio.duration)
except audioread.exceptions.NoBackendError:
    with open(filename, 'rb') as file:  # OSError: [Errno 24] Too many open files
        pass
The audio is opened in a loop using with, but it seems that it is not properly closed. On my system it prints about 500 lines, then raises NoBackendError, and then OSError when trying to open a new file.
Running lsof shows lines of the form (my limit is 1024):
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 38124 ne555 1023u unix 0x00000000d813f451 0t0 2420780 type=STREAM (CONNECTED)
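To watch the descriptor count from inside the script instead of through lsof, something like the following could be used. It is only a sketch: count_open_fds is a made-up helper name, not part of audioread, and it assumes Linux or macOS, where /proc/self/fd or /dev/fd lists the descriptors of the current process.

```python
import os


def count_open_fds():
    """Count this process's open file descriptors (Linux or macOS only)."""
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(fd_dir):
            # Listing the directory briefly holds one descriptor itself, so
            # the result is offset by a constant. Compare successive values
            # rather than reading any single one as exact.
            return len(os.listdir(fd_dir))
    raise OSError("no per-process descriptor directory found")
```

Printing count_open_fds() next to K inside the loop should show a steadily rising number if the stream really is left open after each with block.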
I've observed this issue with the audioread.gstdec.GstAudioFile and audioread.ffdec.FFmpegAudioFile backends; I couldn't test with audioread.rawread.RawAudioFile.
Thanks for the complete script for testing this! I wasn't able to reproduce this with some quick testing (macOS, FFmpeg backend, using lsof to check if things went out of control). I don't have access to something using GStreamer at the moment, but I would very much believe that that backend could have some kind of a leak.
For FFmpeg in particular, is there any chance you can check whether the loopy script also leaves a bunch of ffmpeg processes running? Like, I can imagine ps | grep ffmpeg showing 1024 processes if we're not correctly cleaning things up there.
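For checking this from the test script itself, a rough Python equivalent of ps | grep ffmpeg might look like the sketch below. The names count_by_name and count_running are hypothetical, and it assumes a POSIX ps that accepts -A -o comm=.

```python
import subprocess


def count_by_name(ps_output, name):
    """Count lines of `ps -A -o comm=` output whose command is `name`."""
    # Some systems print the full executable path, so keep only the basename.
    commands = (line.strip().rsplit("/", 1)[-1] for line in ps_output.splitlines())
    return sum(1 for command in commands if command == name)


def count_running(name="ffmpeg"):
    """Count running processes with the given command name."""
    # `-o comm=` prints one command name per process, with no header line.
    result = subprocess.run(
        ["ps", "-A", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return count_by_name(result.stdout, name)
```

Calling count_running() after the loop fails would show whether leftover ffmpeg processes are piling up.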
Good morning. I didn't realise that it was using GstAudioFile before trying to open the file with FFmpegAudioFile.
When I limited the backends to only FFmpegAudioFile, it worked fine.
So the issue seems to be only with the GstAudioFile backend (whether or not it opens the file correctly).
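For anyone wanting to do the same, this is roughly how the backends can be limited. It is a sketch that assumes a recent audioread release in which audio_open accepts a backends list and available_backends() returns the backend classes. The helper names without_gstreamer and open_without_gstreamer are made up for illustration.

```python
def without_gstreamer(backends):
    """Return the backend classes except the GStreamer one."""
    return [b for b in backends if b.__name__ != "GstAudioFile"]


def open_without_gstreamer(path):
    """Open an audio file with every available backend except GStreamer."""
    # Imported lazily so the filter above can be used without audioread.
    import audioread

    # Assumption: this audioread version provides available_backends() and
    # the backends= keyword on audio_open().
    backends = without_gstreamer(audioread.available_backends())
    return audioread.audio_open(path, backends=backends)
```

Replacing audioread.audio_open(filename) in the loop with open_without_gstreamer(filename) should keep the GStreamer backend out of the picture.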
Ah, that makes sense! I unfortunately don't have a great way to test this out here… I can imagine that we may need to do some additional explicit resource cleanup here: https://github.com/beetbox/audioread/blob/ff9535df934c48038af7be9617fdebb12078cc07/audioread/gstdec.py#L378-L406
But it will require some real GStreamer expertise or trial and error to figure out exactly what to clean up.
I have run into the same problem while using librosa.load.
Passing an open file object instead of the path helped:
with open(path, "rb") as fp:
    librosa.load(fp)