faster-whisper
transcribe can't find files outside current script working directory
Hi, I'm on a Mac and I am trying to transcribe an audio file extracted with yt_dlp. The problem is that WhisperModel can't find or correctly process audio files outside the script's working directory.
def process_audios(self) -> bool:
    exts = ['*.m4a', '*.mp3', '*.wav', '*.flac', '*.mp4', '*.wma', '*.aac', '*.ogg']
    print(os.listdir(self.audio_path))
    # ['Tutorial-Master Text Similarity Search with Python & FAISS Vector Database.m4a', 'g30 4.m4a']
    for filename in os.listdir(self.audio_path):
        if any(fnmatch.fnmatch(filename, extension) for extension in exts):
            cur_file = os.path.join(self.audio_path, filename)  # Absolute path
            filename_extensionless = os.path.splitext(filename)[0]
            print('cur_file is: ', cur_file)  # /Users/lmonteir/.HandySpeechBot/projects/project_name/audios/Tutorial-Master Text Similarity Search with Python & FAISS Vector Database.m4a
            print('is valid: ', os.path.isfile(cur_file))  # It says True
            model = WhisperModel(model_size_or_path=self.app_data['user_config']['model'],
                                 cpu_threads=self.app_data['user_config']['cpu_threads'],
                                 download_root=self.models_path)
            segments, info = model.transcribe(cur_file)  # Error happens here.
This is the error stack:
Traceback (most recent call last):
  File "/Users/lmonteir/Projects/handy_speech_bot/DataManager/project_manager.py", line 139, in <module>
    m.process_audios()
  File "/Users/lmonteir/Projects/handy_speech_bot/DataManager/project_manager.py", line 97, in process_audios
    segments, info = model.transcribe(cur_file)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 294, in transcribe
    audio = decode_audio(audio, sampling_rate=sampling_rate)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/audio.py", line 52, in decode_audio
    for frame in frames:
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/audio.py", line 103, in _resample_frames
    for frame in itertools.chain(frames, [None]):
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/audio.py", line 92, in _group_frames
    fifo.write(frame)
  File "av/audio/fifo.pyx", line 30, in av.audio.fifo.AudioFifo.write
  File "av/audio/fifo.pyx", line 74, in av.audio.fifo.AudioFifo.write
RuntimeError: Could not allocate AVAudioFifo.
Now, if I put the files in the current script folder, it runs fine. I have tried wrapping the filename and the absolute path in double quotes, but it didn't work. Is there anything I might be missing?
Make sure you are using the latest faster-whisper version. Also check which PyAV version is installed.
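One way to check both versions from inside the same virtual env is sketched below. It uses only the standard library; the helper names `installed_version` and `report_versions` are made up for illustration. Note that PyAV is published on PyPI under the distribution name `av`, so looking for a package literally named "PyAV" may come up empty.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report_versions(
    dist_names: Sequence[str] = ("faster-whisper", "av"),
) -> Dict[str, Optional[str]]:
    # "av" is the PyPI distribution name for PyAV.
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this with the interpreter from the project's venv reports what that environment actually sees, which can differ from a global `pip list`.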
faster-whisper is on 1.0.1. I couldn't find a package named PyAV. I installed version 12.0.5. The problem persists.
Let me know if you need more information. :) Thanks for the help!
Try to downgrade it, I don't have other ideas...
pip install --force-reinstall av==11.0.0
It didn't work. What I tried instead was using one of those generic audio converter websites to convert my .m4a to .mp3, and it worked nicely!
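For anyone who would rather do the same conversion locally, here is a minimal sketch that shells out to the ffmpeg CLI. It assumes ffmpeg is installed and on PATH, which is not something confirmed in this thread, and the function names are hypothetical. The output format is chosen by the extension of the destination path.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(src: Path, dst: Path) -> List[str]:
    """Build an ffmpeg call that re-encodes src into dst, overwriting dst."""
    # Arguments are passed as a list, so spaces and "&" in the
    # filename need no shell quoting.
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def convert_audio(src: Path, dst: Path) -> Path:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

For example, `convert_audio(Path(cur_file), Path(cur_file).with_suffix(".mp3"))` would write the .mp3 next to the original .m4a, and that new path can then be passed to `model.transcribe`.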
Now, this is what I don't understand: I can process .m4a files in the script folder with no problem, but not when I pass an absolute path. Yet .mp3 files work fine with an absolute path.
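One way to narrow this down is to separate the path from the file itself: copy the failing .m4a to a different directory under a plain name and transcribe that copy by absolute path. The sketch below is only a diagnostic aid; `stage_with_simple_name` is a hypothetical helper, not part of faster-whisper.

```python
import shutil
import tempfile
from pathlib import Path


def stage_with_simple_name(src: Path, stem: str = "sample") -> Path:
    """Copy src into a fresh temp directory as <stem><original suffix>."""
    staging_dir = Path(tempfile.mkdtemp(prefix="whisper_stage_"))
    dst = staging_dir / f"{stem}{src.suffix}"
    shutil.copy2(src, dst)  # Copies the bytes unchanged; only the path differs.
    return dst
```

If `model.transcribe(str(stage_with_simple_name(Path(cur_file))))` fails with the same error, the spaces and "&" in the original path are probably not the cause and the file contents are the more likely suspect. If it succeeds, the original path is worth a closer look.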
Maybe it is something related to my project?
I changed my Hugging Face cache to a folder at /Users/lmonteir/.HandySpeechBot/models.
It is a virtual env, created with python3 -m venv
I'm just confused, but now I have a workaround, which is nice.
Thank you for the support!