RealtimeSTT macOS

Code:

import ssl

import torch
from RealtimeSTT import AudioToTextRecorder

# Disable HTTPS certificate verification for the model download
ssl._create_default_https_context = ssl._create_unverified_context

# Fetch the Silero VAD model via torch.hub
model, _ = torch.hub.load(repo_or_dir="snakers4/silero-vad", model="silero_vad", verbose=True)

if __name__ == '__main__':
    recorder = AudioToTextRecorder(spinner=False)

    print("Say something...")
    while True:
        print(recorder.text(), end=" ", flush=True)

Error:

RealTimeSTT: root - ERROR - Unhandled exeption in _recording_worker: 
Exception in thread Thread-1 (_recording_worker):
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/hanxirui/workspace/python/DataScience/venv/lib/python3.11/site-packages/RealtimeSTT/audio_recorder.py", line 667, in _recording_worker
    while self.audio_queue.qsize() > self.allowed_latency_limit:
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/queues.py", line 126, in qsize
    return self._maxsize - self._sem._semlock._get_value()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError

Nov 09 '23 08:11 hanxirui

Damn. I researched it and yes, macOS does not support the qsize() method of multiprocessing queues.

I need to find a workaround for this. Sorry about the issue.
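
The NotImplementedError comes from multiprocessing.Queue.qsize(), which relies on sem_getvalue(), a call macOS does not implement. One possible workaround, sketched here for illustration only (it is not necessarily what the updated audio_recorder.py does), is to wrap the queue and track the number of items in a shared counter. The class name CountingQueue is made up for this sketch.

```python
import multiprocessing as mp


class CountingQueue:
    """Hypothetical wrapper: a multiprocessing.Queue with a portable qsize()."""

    def __init__(self):
        self._queue = mp.Queue()
        # Shared integer counter that does not depend on sem_getvalue()
        self._size = mp.Value("i", 0)

    def put(self, item):
        # Count the item first, then hand it to the underlying queue
        with self._size.get_lock():
            self._size.value += 1
        self._queue.put(item)

    def get(self, timeout=None):
        # Blocks until an item is available (or the timeout expires)
        item = self._queue.get(timeout=timeout)
        with self._size.get_lock():
            self._size.value -= 1
        return item

    def qsize(self):
        # Approximate size, like the original qsize()
        return self._size.value
```

The counter is only approximate while producers and consumers are running, which matches the documented behavior of qsize() on platforms where it is available.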

Nov 09 '23 08:11 KoljaB

I updated audio_recorder.py to a new version that hopefully fixes this (not available via pip install yet). It would be great to hear whether that works.

Nov 09 '23 09:11 KoljaB

The fix is now also available via pip install (untested though, unfortunately I have no Mac):

pip install --upgrade realtimestt==0.1.7

Nov 09 '23 15:11 KoljaB

Any idea what causes this error?

Say something...
RealTimeSTT: root - WARNING - Audio queue size exceeds latency limit. Current size: 84. Discarding old audio chunks.
zsh: segmentation fault  PYTHONPATH=. python tests/simple_test.py
Process Process-2:
Traceback (most recent call last):
  File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/xiaopel/Github/Startup/RealtimeSTT/RealtimeSTT/audio_recorder.py", line 369, in _transcription_worker
    audio, language = conn.recv()
  File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
    raise EOFError
EOFError

Nov 11 '23 07:11 eelxpeng

I'm sorry, I can't really tell what's going wrong here. After a bit of research, it seems macOS does have some issues with Python's multiprocessing. Maybe it's worth a try with a newer Python version; Python 3.9 is already about three years old.

Nov 11 '23 10:11 KoljaB

Thanks @KoljaB. I got around this by calling a transcribe function directly instead of sending the data to be transcribed via a Pipe. From the design, it seems there is no need to start the transcription_worker process. Is there anything I missed?

Nov 12 '23 18:11 eelxpeng

If you run recorder.text() in a loop, transcribing the last sentence uses so many resources that voice activity detection is not reliable while the transcription runs. This is a problem if the transcription takes some time (a long last sentence) and the next sentence is very short (depending on VAD): the short next sentence would then not be detected. So basically it is a fix for a quite specialized problem. I did not realize that multiprocessing would introduce as many new problems as it did, especially on non-Windows platforms.
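
To make this design concrete, here is a rough sketch of the pattern, not the actual RealtimeSTT code: a worker receives (audio, language) tuples over a Pipe, as the conn.recv() line in the traceback above suggests, and sends the text back. The transcribe function is a hypothetical placeholder for the real speech-to-text call, and the None shutdown sentinel is an assumption of this sketch.

```python
import multiprocessing as mp


def transcribe(audio, language):
    """Hypothetical stand-in for the real (slow, CPU-heavy) transcription."""
    return f"<{len(audio)} samples, language={language}>"


def transcription_worker(conn):
    """Serve transcription requests until None arrives or the pipe closes."""
    while True:
        try:
            message = conn.recv()
        except EOFError:
            break  # the other end of the pipe was closed
        if message is None:
            break  # shutdown sentinel (an assumption of this sketch)
        audio, language = message
        conn.send(transcribe(audio, language))


def transcribe_in_subprocess(audio, language):
    """Run one transcription in a separate process and return its text."""
    parent_conn, child_conn = mp.Pipe()
    worker = mp.Process(target=transcription_worker, args=(child_conn,))
    worker.start()
    parent_conn.send((audio, language))
    text = parent_conn.recv()
    parent_conn.send(None)  # ask the worker to exit
    worker.join()
    return text
```

Calling transcribe(audio, language) directly in the recording process is the simpler variant; the subprocess variant keeps the recording process free for voice activity detection. On macOS, where spawn is the default start method, transcribe_in_subprocess should only be called from under an if __name__ == '__main__': guard.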

Nov 12 '23 18:11 KoljaB

@eelxpeng what changes did you make in audio_recorder.py when replacing transcription_worker with transcribe? Can you post the diff?

Dec 13 '23 17:12 abhishek-tg

Maybe this checkpoint helps; it is from before multiprocessing was introduced.

Dec 13 '23 18:12 KoljaB

Thanks, but that causes the stream-closed issue https://github.com/KoljaB/RealtimeSTT/issues/3, and torch together with faster-whisper is already an issue: https://github.com/SYSTRAN/faster-whisper/issues/137

Dec 13 '23 18:12 abhishek-tg

Still the same issue. I tried on Python 3.11 and 3.12. Will take a look later.

Feb 03 '24 22:02 ekimia

Great work, KoljaB! I found the fix after racking my brain over it for a while on macOS: replace the multiprocessing Queue with Manager.Queue. It works perfectly.

I had to settle on Python 3.11 for faster-whisper and other dependencies to work.

I don't need the wake words and some other features, so I had to strip out parts of the code. Still, it serves my purpose.

Another thing I noticed was the device index. It works without passing one, and in my case the mic was on device 1. It took me a while to list the channels and identify the right value.

from multiprocessing import Manager

manager = Manager()
queue = manager.Queue()

# ... use the queue ...

if queue.qsize() > 0:  # Check for elements
    print("Queue has elements.")

Feb 09 '24 05:02 astuteprogrammer

Thanks a lot for this hint. I recently switched RealtimeSTT (and RealtimeTTS) from multiprocessing to torch.multiprocessing. Does the problem still exist with v1.9.0? (I hope to be lucky and the switch to torch.multiprocessing has the same effect on macOS.) For the device_index I probably need to add an option to list the devices.
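
Until such an option exists, the devices can be listed by hand. The sketch below assumes PyAudio is the audio backend and is installed; the helper names input_devices and query_pyaudio_devices are made up for this example.

```python
def input_devices(device_infos):
    """Return (index, name) pairs for devices that can record audio."""
    return [
        (int(info["index"]), info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_pyaudio_devices():
    """Collect the device info dicts reported by PyAudio."""
    import pyaudio  # imported here so input_devices() works without PyAudio

    pa = pyaudio.PyAudio()
    try:
        return [
            pa.get_device_info_by_index(i)
            for i in range(pa.get_device_count())
        ]
    finally:
        pa.terminate()
```

Printing input_devices(query_pyaudio_devices()) shows which index belongs to the microphone; that index is the value to use for the recorder's device index.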

Feb 09 '24 08:02 KoljaB

Unfortunately the switch to torch.multiprocessing reintroduced the qsize issue on macOS.

Mar 24 '24 18:03 fronx

Is this using PyTorch MPS acceleration?

Mar 30 '24 09:03 ehartford

> Unfortunately the switch to torch.multiprocessing reintroduced the qsize issue on macOS.

I'll make a fix for this.

> is this using pytorch MPS acceleration?

RealtimeSTT depends on the faster-whisper library, which in turn uses CTranslate2. An issue discussion in the faster-whisper GitHub repo says there is no built-in support for AMD, MPS, etc. acceleration, but that it is possible to enable these backends by compiling CTranslate2 from source with the desired backend before installing faster-whisper.

So, if I got this right, for MPS acceleration you would first compile CTranslate2 with the necessary backend support (MPS enabled) and then install RealtimeSTT, which installs faster-whisper but should not override the manually compiled version of CTranslate2.

Mar 30 '24 13:03 KoljaB

> Unfortunately the switch to torch.multiprocessing reintroduced the qsize issue on macOS.

Should be fixed with v0.1.12 now.

Mar 30 '24 16:03 KoljaB

Yay it's working! Thank you 🫶🏼

Mar 30 '24 22:03 fronx

> Great work. KoljaB! I found the fix after breaking my head for a while in macOSX. replace the multiprocessing Queue with Manager.Queue. It works perfectly.
>
> I had to settle on python 3.11 for faster-whisper and other dependencies to work
>
> I don't want the wake words and other aspects. So i had to strip out certain aspects of the code. Still it serves my purpose.
>
> Another thing I noticed was the device index. It works without passing, and in my case the mic was on the device 1. took me a while to list channels & identify the right value.
>
> from multiprocessing import Manager
>
> manager = Manager()
> queue = manager.Queue()
>
> # ... use the queue ...
>
> if queue.qsize() > 0:  # Check for elements
>     print("Queue has elements.")

Thank you, this Manager Queue works for me as well

Sep 19 '24 13:09 saurabh-ontoforce
