WhisperLive

Simple Client Recording Attempt

Open justinlevi opened this issue 2 years ago • 24 comments

I start up the server via $ python ./run_server.py

(whisper_live)  whisperlive git:(main)✗  🚀 python ./run_server.py
Downloading: "https://github.com/snakers4/silero-vad/archive/master.zip" to /Users/justinwinter/.cache/torch/hub/master.zip
2023-08-21 12:14:34.119619 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '628'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119647 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '629'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119652 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '623'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119655 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '625'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119659 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '620'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119696 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '139'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119701 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '131'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119704 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '140'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119708 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '134'. It is not used by any node and should be removed from the model.
2023-08-21 12:14:34.119711 [W:onnxruntime:, graph.cc:3543 CleanUnusedInitializersAndNodeArgs] Removing initializer '136'. It is not used by any node and should be removed from the model.
ERROR:root:no close frame received or sent

Then start up the client via:

(whisper_live)  whisperlive git:(main)✗  🚀 python ./run_client.py
[INFO]: * recording
[INFO]: Waiting for server ready ...
False en transcribe
[INFO]: Opened connection
[INFO]: Server Ready!
Traceback (most recent call last):
  File "/Users/justinwinter/projects/whisperlive/./run_client.py", line 3, in <module>
    client()
  File "/Users/justinwinter/projects/whisperlive/whisper_live/client.py", line 298, in __call__
    self.client.record()
  File "/Users/justinwinter/projects/whisperlive/whisper_live/client.py", line 234, in record
    data = self.stream.read(self.CHUNK)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper_live/lib/python3.9/site-packages/pyaudio/__init__.py", line 570, in read
    return pa.read_stream(self._stream, num_frames,
OSError: [Errno -9981] Input overflowed

# run_client.py

from whisper_live.client import TranscriptionClient
client = TranscriptionClient("0.0.0.0", "8080", is_multilingual=False, lang="en", translate=False)
client()

justinlevi avatar Aug 21 '23 16:08 justinlevi

Are you running this on a mac?

zoq avatar Aug 22 '23 16:08 zoq

A temporary workaround is to set exception_on_overflow=False in https://github.com/collabora/WhisperLive/blob/main/whisper_live/client.py#L234.

This might cause us to skip some frames; we are looking into updating the frame rate for the different platforms.

zoq avatar Aug 22 '23 16:08 zoq

@zoq Thanks for the idea. Yes, I am on a Mac. I tried setting exception_on_overflow=False but I'm still getting the same error:

(whisper_live)  whisperlive git:(main)✗  🚀 python ./run_client.py
[INFO]: * recording
[INFO]: Waiting for server ready ...
False en transcribe
[INFO]: Opened connection
[INFO]: Server Ready!
Traceback (most recent call last):
  File "/Users/justinwinter/projects/whisperlive/./run_client.py", line 3, in <module>
    client()
  File "/Users/justinwinter/projects/whisperlive/whisper_live/client.py", line 299, in __call__
    self.client.record()
  File "/Users/justinwinter/projects/whisperlive/whisper_live/client.py", line 235, in record
    data = self.stream.read(self.CHUNK)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper_live/lib/python3.9/site-packages/pyaudio/__init__.py", line 570, in read
    return pa.read_stream(self._stream, num_frames,
OSError: [Errno -9981] Input overflowed

    def record(self, out_file="output_recording.wav"):
        n_audio_file = 0
        # create dir for saving audio chunks
        if not os.path.exists("chunks"):
            os.makedirs("chunks", exist_ok=True)
        try:
            for _ in range(0, int(self.RATE / self.CHUNK * self.RECORD_SECONDS)):
                if not Client.RECORDING: break
                self.exception_on_overflow=False
                data = self.stream.read(self.CHUNK)
                self.frames += data

                audio_array = Client.bytes_to_float_array(data)
                
                self.send_packet_to_server(audio_array.tobytes())

                # save frames if more than a minute
                if len(self.frames) > 60*self.RATE:
                    t = threading.Thread(
                        target=self.write_audio_frames_to_file,
                        args=(self.frames[:], f"chunks/{n_audio_file}.wav", )
                    )
                    t.start()
                    n_audio_file += 1
                    self.frames = b""

justinlevi avatar Aug 22 '23 18:08 justinlevi
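
A note on the snippet above: assigning `self.exception_on_overflow = False` only creates an attribute on the client object, which `stream.read` never looks at. PyAudio takes `exception_on_overflow` as a keyword argument of `Stream.read` itself. The following is a minimal sketch of the difference, not the project's code; `FakeStream` and `Recorder` are hypothetical stand-ins so that it runs without a microphone or PyAudio installed.

```python
class FakeStream:
    """Hypothetical stand-in for pyaudio.Stream that always reports an overflow."""

    def read(self, num_frames, exception_on_overflow=True):
        # Same keyword and default as PyAudio's Stream.read.
        if exception_on_overflow:
            raise OSError(-9981, "Input overflowed")
        return b"\x00\x00" * num_frames  # 16-bit mono silence


class Recorder:
    """Hypothetical, trimmed-down version of the client's record loop."""

    def __init__(self, stream, chunk=1024):
        self.stream = stream
        self.chunk = chunk

    def read_with_attribute(self):
        # Mirrors the attempt in the thread: read() never consults this attribute.
        self.exception_on_overflow = False
        return self.stream.read(self.chunk)

    def read_with_keyword(self):
        # Passes the flag to read() itself, so the overflow is not raised.
        return self.stream.read(self.chunk, exception_on_overflow=False)
```

With a real PyAudio stream, frames dropped during an overflow are still lost; the keyword only suppresses the exception.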

Okay, I can reproduce the issue on a Mac. I'll come up with a fix.

zoq avatar Aug 22 '23 20:08 zoq

welp

aavetis avatar Nov 09 '23 21:11 aavetis

rip?

Geczy avatar Nov 12 '23 07:11 Geczy

> welp

Do you still have this problem with the latest release?

zoq avatar Nov 12 '23 18:11 zoq

I have pulled the latest changes from the git repo and I still have this issue on my mac. Have you made changes regarding this issue? Thanks.

sbrnaderi avatar Nov 18 '23 11:11 sbrnaderi

OK, I just tried increasing the self.chunk value to 1024 * 4 in the client.py file. I don't get the error anymore and the transcription works fine.

sbrnaderi avatar Nov 18 '23 11:11 sbrnaderi

This is on a mac?

zoq avatar Nov 18 '23 19:11 zoq

> This is on a mac?

Yes, this is on a MacBook Pro (Intel-based). I increased the chunk size and got the code to work. I also noticed that if I use the bigger Whisper model (medium), I have to increase it further, to 1024 * 8.

sbrnaderi avatar Nov 18 '23 20:11 sbrnaderi
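
To put the chunk sizes above in perspective: each `stream.read(chunk)` call returns `chunk` frames, so a larger chunk means fewer, longer reads. The sketch below does the arithmetic, assuming a 16 kHz sample rate (an assumption; the thread does not state the rate the client records at). `chunk_duration_ms` is a hypothetical helper name.

```python
def chunk_duration_ms(chunk_frames: int, sample_rate: int = 16000) -> float:
    """Milliseconds of audio covered by one read of `chunk_frames` frames.

    The 16000 Hz default is an assumption, not taken from the thread.
    """
    return 1000.0 * chunk_frames / sample_rate


for chunk in (1024, 1024 * 4, 1024 * 8):
    print(f"{chunk:5d} frames -> {chunk_duration_ms(chunk):.0f} ms per read")
# prints:
#  1024 frames -> 64 ms per read
#  4096 frames -> 256 ms per read
#  8192 frames -> 512 ms per read
```

Under that assumption, moving from 1024 to 1024 * 4 means the loop has to return to `read` about every 256 ms instead of every 64 ms, which is one plausible reason the larger values leave more room for slow iterations.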

This issue still persists on a MacBook. Sending an audio file instead works just fine.

arbianqx avatar Dec 14 '23 15:12 arbianqx

Looking into it today.

zoq avatar Dec 14 '23 15:12 zoq

> OK, I just tried increasing the self.chunk value to 1024 * 4 in the client.py file. I don't get the error anymore and the transcription works fine.

I tried that, but for me it still crashed with the original error.

niderhoff avatar Dec 18 '23 13:12 niderhoff

I can confirm that it works if I update the chunk size. @niderhoff, let's try to figure out why it's not working on your system. Just to make sure our setups are the same: are you using the pip package rather than the Docker container, or are you running the scripts without the pip package?

zoq avatar Dec 18 '23 20:12 zoq

I have the same problem.

[INFO]: * recording
[INFO]: Waiting for server ready ...
True ko transcribe
[INFO]: Opened connection
[INFO]: Server Ready!
Traceback (most recent call last):
  File "/Users/asadal/Documents/Dev/Hani/WhisperLive_streamlit.py", line 13, in <module>
    client()
  File "/Users/asadal/miniconda3/lib/python3.10/site-packages/whisper_live/client.py", line 490, in __call__
    self.client.record()
  File "/Users/asadal/miniconda3/lib/python3.10/site-packages/whisper_live/client.py", line 371, in record
    data = self.stream.read(self.chunk)
  File "/Users/asadal/miniconda3/lib/python3.10/site-packages/pyaudio/__init__.py", line 570, in read
    return pa.read_stream(self._stream, num_frames,
OSError: [Errno -9981] Input overflowed

MacBook Pro 14 (M1 Pro), simple client recording. I used the pip package.

# Run the client
from whisper_live.client import TranscriptionClient
client = TranscriptionClient(
  "localhost",
  9090,
  is_multilingual=True,
  lang="ko",
  translate=False,
  model_size="small"
)
client()

Then, I encountered an error,

TypeError: TranscriptionClient.__init__() got an unexpected keyword argument 'model_size'

So I removed the model_size="small" option and ran it again, but then another error occurred: OSError: [Errno -9981] Input overflowed

I changed self.chunk = 1024 to self.chunk = 1024 * 4, but encountered the same error.

asadal avatar Dec 24 '23 22:12 asadal

We still have to release the latest pip package; in the meantime you can remove model_size="small" from the TranscriptionClient call. For the overflow issue, can you try stream.read(self.chunk, exception_on_overflow=False) in https://github.com/collabora/WhisperLive/blob/main/whisper_live/client.py#L415?

zoq avatar Dec 24 '23 23:12 zoq
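
Until a release that accepts `model_size` is installed, one way to keep a single script working across package versions is to drop keyword arguments the installed constructor does not declare. This is only a sketch using the standard `inspect` module; `supported_kwargs` is a hypothetical helper and `OldClient` merely stands in for an older `TranscriptionClient`, with a signature assumed for illustration.

```python
import inspect


def supported_kwargs(callable_obj, **kwargs):
    """Return only the keyword arguments that `callable_obj` declares."""
    params = inspect.signature(callable_obj).parameters
    # If the callable accepts **kwargs, everything can be passed through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return kwargs
    return {name: value for name, value in kwargs.items() if name in params}


class OldClient:
    """Hypothetical stand-in for a client version without `model_size`."""

    def __init__(self, host, port, is_multilingual=False, lang=None, translate=False):
        self.host, self.port, self.lang = host, port, lang


options = supported_kwargs(
    OldClient,
    is_multilingual=True,
    lang="ko",
    translate=False,
    model_size="small",  # silently dropped because OldClient does not declare it
)
client = OldClient("localhost", 9090, **options)
```

The trade-off is that a dropped option goes unnoticed, so logging the difference between the requested and supported keys may be worthwhile.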

Thanks zoq, I'll wait for the updated pip package. Thank you for creating such a great application.

Best Regards.

asadal avatar Dec 24 '23 23:12 asadal

@asadal the pip package is updated. Let us know if you are still facing the issue.

makaveli10 avatar Jan 02 '24 05:01 makaveli10

I have had success with the newest version and setting exception_on_overflow to False. Can this be set by default?

kjyv avatar Jan 02 '24 12:01 kjyv

https://github.com/collabora/WhisperLive/pull/83 does that; we will merge it and release a new version.

zoq avatar Jan 02 '24 13:01 zoq

I got a segmentation fault with the latest version...

JonathanLehner avatar Jan 12 '24 22:01 JonathanLehner

@JonathanLehner can you share more details on when you see the segfault? And does it always happen on the latest version?

makaveli10 avatar Jan 19 '24 05:01 makaveli10

I just tried the demo from the Readme:

llm_server.py

from whisper_live.server import TranscriptionServer
server = TranscriptionServer()
server.run("0.0.0.0", 8080)

llm_client.py

from whisper_live.client import TranscriptionClient
client = TranscriptionClient(
  "localhost",
  8080,
  is_multilingual=True,
  lang="en",
  translate=False,
  #model_size="tiny"
)

client()
#client("audio_test.wav")

python llm_server.py
zsh: segmentation fault  python llm_server.py
(physiotherapy) jonathan@Jonathans-MBP physiotherapy % /usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

JonathanLehner avatar Jan 20 '24 01:01 JonathanLehner
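
One way to gather the details asked for above is Python's standard `faulthandler` module, which dumps the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. This is a minimal sketch, not part of WhisperLive; `run_with_fault_traceback` and `start_server` are hypothetical names, and the server call simply mirrors the README snippet quoted above.

```python
import faulthandler


def run_with_fault_traceback(entry_point):
    """Enable faulthandler, then call `entry_point` and return its result."""
    # On a segfault, the traceback of every thread is written to stderr.
    faulthandler.enable(all_threads=True)
    return entry_point()


def start_server():
    # Imported lazily so this sketch can be loaded without whisper_live installed.
    from whisper_live.server import TranscriptionServer

    server = TranscriptionServer()
    server.run("0.0.0.0", 8080)
```

Calling `run_with_fault_traceback(start_server)` should show which Python call was active when the crash happened. Running the unmodified script with `python -X faulthandler llm_server.py` has the same effect.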