audioread
Error "cannot join thread before it is started" after upgrading to 3.0.0
Hello, I'm getting this threading error after upgrading to the latest version.
This is currently my full stack trace
Huh, that's interesting! I don't see an immediate reason why #114 would have caused this, but perhaps @Bomme can see a reason?
Can you provide instructions to reproduce the problem? (That is, what code produced this crash?)
Thanks, I will check the implementation. I'm not using it directly but via some other package, and I'm not sure which one. Let me dig into it.
I don't think it's related to the latest changes. In the traceback in the screenshot, line 300 in ffdec.py is self.stderr_reader.join(). Since the latest changes, this is actually on line 297.
@loretoparisi maybe you can provide the output of pip freeze?
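If a full pip freeze is too noisy, a narrower check is possible from Python itself. This is only a sketch: the helper name installed_versions and the list of packages are assumptions about what is relevant to this report, not something taken from the original environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package list is a guess at what matters for this issue.
    print(installed_versions(["audioread", "librosa", "soundfile"]))
```

Running this inside the same environment that crashes should show whether audioread 3.0.0 is really the version in use.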
Indeed—maybe it would also be worth trying the same code on an older version to check whether the crash is truly new?
Thanks for helping, guys. It seems we use it through librosa, and I assume that pulls in the PyPI version, so from what I can see it should not in fact be the latest one. We will try to reproduce it. By the way, our first idea was that this could be related to an issue on the file system where the audio was located.
Since the error I see here is a threading error (join), what I'm not sure of is whether it originates internally in your SDK due to I/O access issues and is then thrown outward, or whether the log just says that something external occurred while the FFmpeg audio reader thread was running... As soon as I can reproduce it, I will tell you more.
Did you get anywhere with resolving this @loretoparisi?
I'm encountering the same issue using librosa to load audio from mp4 files.
Nope, I moved to native ffmpeg.
@DWhettam can you please share a stacktrace of the error that you see?
Sure. I get the "cannot join thread before it is started" error as well as "can't start new thread". I also get "Format not recognised." on the mp4 videos. The error occurs on a different mp4 each time, and only when I try to read both the audio and the video from the mp4 file. If I load just the audio, or just the video, I don't have any issues, so I'm fairly certain there is nothing wrong with the files themselves. More specifically, if I call read_video (which uses pytorchvideo), I have no errors. If I call read_audio (which uses librosa), I have no issues. But if I call one after the other, I encounter the issue below. Any help would be greatly appreciated!
Traceback (most recent call last):
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/audioread/ffdec.py", line 308, in __del__
self.close()
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/audioread/ffdec.py", line 297, in close
self.stderr_reader.join()
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/threading.py", line 1107, in join
raise RuntimeError("cannot join thread before it is started")
RuntimeError: cannot join thread before it is started
Traceback (most recent call last):
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/RepetitionCounting/train_multimodal_w_eval_stats.py", line 508, in <module>
for (i, data) in enumerate(dataloader):
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1325, in _next_data
return self._process_data(data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
data.reraise()
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/_utils.py", line 644, in reraise
raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 5.
Original Traceback (most recent call last):
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/librosa/core/audio.py", line 175, in load
y, sr_native = __soundfile_load(path, offset, duration, dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/librosa/core/audio.py", line 208, in __soundfile_load
context = sf.SoundFile(path)
^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/soundfile.py", line 658, in __init__
self._file = self._open(file, mode_int, closefd)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/soundfile.py", line 1216, in _open
raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening '/raid/local_scratch/ddw69-wwp01/569411/countix_videos/sc5bsO7CYDs.mp4': Format not recognised.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
~~~~~~~~~~~~^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/RepetitionCounting/dataloader_multimodal.py", line 340, in __getitem__
audio = read_audio(video_name,start_crop,end_crop,self.add_noise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/RepetitionCounting/dataloader_multimodal.py", line 113, in read_audio
y, sr = librosa.load(video_filename, offset=start, duration=seg_duration)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/librosa/core/audio.py", line 183, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/librosa/util/decorators.py", line 59, in __wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/librosa/core/audio.py", line 239, in __audioread_load
reader = audioread.audio_open(path)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/audioread/__init__.py", line 127, in audio_open
return BackendClass(path)
^^^^^^^^^^^^^^^^^^
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/site-packages/audioread/ffdec.py", line 177, in __init__
self.stderr_reader.start()
File "/jmain02/home/J2AD001/wwp01/ddw69-wwp01/.conda/envs/repcount/lib/python3.11/threading.py", line 957, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
@Bomme by reducing the number of workers in my PyTorch DataLoader I am able to run the code for much longer, although the error still occurs. Instead of failing within the first epoch of training, it now fails in the sixth epoch. I'm not sure how to interpret this, but at least it confirms that the "Format not recognised" part of the stack trace is a red herring.
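One plausible reading of the two tracebacks above is that they are the same failure seen twice: start() fails in __init__ with "can't start new thread", and the later close() (reached via __del__) then calls join() on a thread that never started. The sketch below reproduces that chain in isolation. The class Reader, the fail_to_start flag, and close_guarded are hypothetical names for illustration; this is not audioread's actual code, and the start failure is simulated rather than caused by real thread exhaustion.

```python
import threading


class Reader:
    """Hypothetical stand-in for a reader that owns a helper thread."""

    def __init__(self, fail_to_start=False):
        self.stderr_reader = threading.Thread(target=lambda: None)
        if fail_to_start:
            # Simulates the OS refusing to create another thread.
            raise RuntimeError("can't start new thread")
        self.stderr_reader.start()

    def close(self):
        # Unguarded join: raises if the thread was never started.
        self.stderr_reader.join()

    def close_guarded(self):
        # Thread.ident stays None until the thread has actually started,
        # so it can be used to skip the join safely.
        thread = getattr(self, "stderr_reader", None)
        if thread is not None and thread.ident is not None:
            thread.join()


def simulate_failed_open():
    """Return the two error messages produced by a failed open plus cleanup."""
    # Allocate first so the object survives the failing __init__, as it
    # would when a finalizer later runs on a half-constructed instance.
    reader = Reader.__new__(Reader)
    try:
        reader.__init__(fail_to_start=True)
    except RuntimeError as exc:
        first = str(exc)
    try:
        reader.close()
    except RuntimeError as exc:
        second = str(exc)
    reader.close_guarded()  # does not raise
    return first, second


if __name__ == "__main__":
    print(simulate_failed_open())
```

If this reading is right, the join error is only a symptom, and the thing to investigate is why the worker processes run out of threads, which would also fit the observation that fewer DataLoader workers delays the crash.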