Results: 18 comments of ewagner70

Hi, +1 on this issue ... no resolution yet?

> Uneducated guess:
>
> ```python
> chunk_frames = np.frombuffer(b''.join(frames), dtype=np.float32) / 32768.0
> ```

No, I would have wondered if that worked ... did you try it yourself? Both...

> Try to save the audio to a file and look at the differences.

That's tricky, as I use a microphone and all I get is a byte stream in float32 format. It...

> @ewagner70 the reason behind `32768.0` is to normalize the array values between -1 and 1
>
> it's a specification of how to represent audio in a numerical array; the values...
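The normalization being discussed can be sketched as follows. The key point (a likely source of the bug in this thread) is that the `32768.0` divisor only makes sense for int16 PCM data; if the microphone already delivers float32 samples, they are typically already in [-1, 1] and dividing them again makes the signal nearly silent. The sample values below are illustrative only:

```python
import numpy as np

# int16 PCM spans -32768..32767, so dividing by 32768.0 maps the
# samples into [-1.0, 1.0), the range speech models generally expect.
int16_bytes = np.array([-32768, 0, 16384, 32767], dtype=np.int16).tobytes()
normalized = np.frombuffer(int16_bytes, dtype=np.int16).astype(np.float32) / 32768.0
print(normalized)  # values in [-1.0, 1.0)

# If the stream is already float32 (as reported in this thread), it is
# usually already normalized -- interpret the bytes as-is instead:
float32_bytes = np.array([-1.0, 0.0, 0.5], dtype=np.float32).tobytes()
as_is = np.frombuffer(float32_bytes, dtype=np.float32)
print(as_is)  # [-1.   0.   0.5]
```

Note that `np.frombuffer` with the wrong `dtype` (reading float32 bytes as int16, or vice versa) produces garbage-looking audio, which matches the "strange output" symptom described here.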

> Hi @ewagner70, do you have updates?

@anbzerc: unfortunately no update, as I am at my wits' end ... even the faster_whisper guys obviously don't...

> @ewagner70 @anbzerc You can try to use WebRTC with [aiortc](https://github.com/aiortc/aiortc) on the Python backend. AioRTC handles the conversion of raw audio packets to av.AudioFrame (from PyAV). Because with...

> This example should be interesting: https://github.com/aiortc/aiortc/tree/main/examples/server I'll try it as soon as possible

@anbzerc: this example uses an ICE server ... when you solve it without using ICE,...

> @ewagner70 if you save the file to disk and pass it to the model, then the transcription is OK
>
> ```
> fname = r"C:\test.wav"
> sig =...
> ```

> Greetings,
>
> My guess is that the data obtained from the microphone is 44100 Hz or 48 kHz, while the model supports 16 kHz, so you got strange output. And...
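The sample-rate mismatch suggested above can be checked with a minimal resampler sketch. The linear-interpolation approach below is a naive assumption for illustration; for real audio, `scipy.signal.resample_poly` or `librosa.resample` would give better quality:

```python
import numpy as np

def resample_linear(sig, orig_sr, target_sr):
    """Naive linear-interpolation resampler (sketch only)."""
    n_out = int(round(len(sig) * target_sr / orig_sr))
    x_old = np.linspace(0.0, 1.0, num=len(sig), endpoint=False)
    x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
    return np.interp(x_new, x_old, sig).astype(np.float32)

# One second of a 440 Hz tone captured at 48 kHz, taken down to the
# 16 kHz the model expects:
t = np.arange(48000) / 48000.0
tone = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)
tone_16k = resample_linear(tone, 48000, 16000)
print(len(tone_16k))  # 16000
```

Feeding 44.1/48 kHz samples to a model that assumes 16 kHz effectively plays the audio back 3x too fast, which would explain transcriptions that look like garbled speech rather than noise.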

Can be closed now ... the documentation is really sub-par and nothing for the faint-hearted ...