Erik Hermansen comments

Results: 27 comments by Erik Hermansen

Added script to compute phoneme labels and timestamps

I would also love to see this merged. I've written automatic lip synch animation software based on Vosk using word timings. The algorithm makes guesses about the timings of phonemes....

Added script to compute phoneme labels and timestamps

@kevin, Rhubarb looks cool. I will check it out more later. (In reply to Kevin Harrington, Wed, May 31, 2023, 5:38 PM: "I would also love to see this...")

Two problems when using vosk-browser with non-streaming, separated static waveforms

@lheine10, another workaround - admittedly, not ideal... You can pass silence samples to `acceptWaveformFloat()` to induce the final result. I'm not a maintainer/official project person, so don't interpret the suggestion...

Two problems when using vosk-browser with non-streaming, separated static waveforms

Example of forcing a result:

```
// kaldiSampleRate = whatever kaldiRecognizer was constructed with.
const silenceSamples = createSilenceSamples(kaldiSampleRate, 2000);
kaldiRecognizer.acceptWaveformFloat(silenceSamples, kaldiSampleRate);
```

Code for `createSilenceSamples()`: https://github.com/erikh2000/sl-web-audio/blob/main/src/generating/silenceUtil.ts
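For readers who don't want to pull in the linked repo, here is a minimal sketch of what a helper like `createSilenceSamples()` could look like. This is not the code from sl-web-audio. It assumes the second argument is a duration in milliseconds (so `2000` means two seconds) and that silence is simply a zero-filled `Float32Array`, which may differ from the real implementation.

```typescript
// Hypothetical stand-in for createSilenceSamples(); the real helper is in the
// linked sl-web-audio repo and may behave differently.
// Assumption: durationMsecs is in milliseconds.
function createSilenceSamples(sampleRate: number, durationMsecs: number): Float32Array {
  // Number of samples needed to cover the requested duration.
  const sampleCount = Math.round((sampleRate * durationMsecs) / 1000);
  // A new Float32Array is zero-filled, and zero-valued samples are digital silence.
  return new Float32Array(sampleCount);
}
```

With a 16000 Hz recognizer, `createSilenceSamples(16000, 2000)` would return 32000 zero samples, which can then be passed to `acceptWaveformFloat()` as in the example above.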

Alternative?

@msqr1 I'm interested in your project, but I'm likely to stick with vosk-browser out of inertia and not having any complaints with it. The main thing I saw in Vosklet...

Alternative?

No worries, @msqr1. I don't expect you to be super-scientific in your claims. I was just curious about what kind of speed increase you might be seeing. Your changes for...

Bug in read of RIFF signature..?

I think I saw the same problem. I'll add a little more information here. I get an exception inside of `readLISTINFOSubChunks_()` where the second `subChunk` param has a subchunk similar...

Bug in read of RIFF signature..?

Here is a WAV file that should reproduce the exception: [female-sad-2.wav](https://github.com/erikh2000/wisp/blob/main/public/speech/test-voices/female-sad-2.wav) I have all rights to the file and it's MIT-licensed from my repo. I am fine with it being...

Attempting to pass data to the KaldiRecognizer results in an odd internal error

Hey, Tim. What happens if you run the examples? Do you get the same error with their calls to acceptWaveform()? Just trying to narrow down the problem, e.g. maybe Vosk...

Is the demo working?

Works for me from here: https://ccoreilly.github.io/vosk-browser/ using Chrome on macOS. After loading the "English" model, I said "hello" and it was recognized nearly instantly.