spchcat

Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

17 spchcat issues

I am looking to use spchcat for on-demand .wav file transcription. However, I require the model to be preloaded and waiting for intermittent transcription of input .wav files....

I am trying to get spchcat working on my Raspberry Pi. When I run the command, it prints this to the console: ``` TensorFlow: v2.3.0-14-g4bdd3955115 Coqui STT: v1.1.0-0-gf3605e23 ``` and...

As per the [man page](https://man7.org/linux/man-pages/man3/setenv.3.html), `setenv` requires `_POSIX_C_SOURCE` >= 200112L to be defined before including the appropriate header file (`stdlib.h`). As the other included header files include some standard headers...
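
To illustrate the point about `setenv`, here is a minimal sketch of defining the feature-test macro ahead of every system header. The helper name `set_and_check_env` is hypothetical and does not come from spchcat; the `#ifndef` guard is an assumption added so the sketch still compiles when a build system already passes the macro with `-D`.

```c
/* The feature-test macro must be visible before ANY system header is
 * included, otherwise <stdlib.h> may not declare setenv() in strict modes.
 * The guard avoids a redefinition warning if the build already sets it. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of spchcat): sets an environment variable,
 * overwriting any existing value, and reports whether it reads back intact.
 * Returns 1 on success, 0 on failure. */
static int set_and_check_env(const char *name, const char *value) {
  if (setenv(name, value, 1) != 0) {
    return 0;
  }
  const char *readback = getenv(name);
  return readback != NULL && strcmp(readback, value) == 0;
}
```

Defining the macro in the source file keeps the requirement next to the call that needs it; passing `-D_POSIX_C_SOURCE=200112L` from the build system is an equally valid choice.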

When using the `TEST_FLTEQ` macro, pass float literals for the comparison argument, to avoid errors of the form: error: absolute value function 'fabsf' given an argument of type 'double' but...
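
The diagnostic can be reproduced with any macro that feeds a difference into `fabsf`. The sketch below uses a hypothetical `EXAMPLE_FLTEQ` macro, since the real definition of `TEST_FLTEQ` is not shown here and may differ. With an unsuffixed literal such as `0.5`, the subtraction is promoted to `double`; the `f` suffix keeps the whole expression in `float`.

```c
#include <math.h>

/* Hypothetical stand-in for a float-equality test macro; the actual
 * TEST_FLTEQ in spchcat may be defined differently. */
#define EXAMPLE_FLTEQ(actual, expected, epsilon) \
  (fabsf((actual) - (expected)) <= (epsilon))

/* Returns 1 when one half, computed in float, is within tolerance of 0.5f. */
static int half_is_close(void) {
  const float value = 1.0f / 2.0f;
  /* 0.5f and 0.0001f are float literals, so (value - 0.5f) stays a float
   * and matches fabsf's parameter type. Writing 0.5 instead would make the
   * argument a double and can trigger the warning quoted above. */
  return EXAMPLE_FLTEQ(value, 0.5f, 0.0001f);
}
```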

It looks like `fread_and_discard` acts the same as [`fseek`](https://man7.org/linux/man-pages/man3/fseek.3.html). If so, this will replace calls to `fread_and_discard` with equivalent calls to `fseek` ✌️
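
As a rough comparison of the two approaches, the sketch below pairs a read-and-discard loop with a relative `fseek`. The function names and signatures are hypothetical; the real `fread_and_discard` is not shown here and may differ. One caveat worth keeping in mind: the two are only interchangeable on seekable streams, because `fseek` fails on pipes, and `fseek` can succeed when moving past the end of a file whereas reading would stop short.

```c
#include <stdio.h>

/* Hypothetical read-and-discard helper (not spchcat's actual code):
 * consumes byte_count bytes by reading them into a scratch buffer.
 * Returns 1 on success, 0 if the stream ends or errors first. */
static int discard_by_reading(FILE *file, size_t byte_count) {
  char scratch[256];
  while (byte_count > 0) {
    size_t want = byte_count < sizeof(scratch) ? byte_count : sizeof(scratch);
    size_t got = fread(scratch, 1, want, file);
    if (got == 0) {
      return 0; /* EOF or read error before all bytes were consumed */
    }
    byte_count -= got;
  }
  return 1;
}

/* The fseek-based equivalent for seekable files: moves the position
 * forward relative to the current offset. Returns 1 on success. */
static int discard_by_seeking(FILE *file, long byte_count) {
  return fseek(file, byte_count, SEEK_CUR) == 0;
}
```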

Properly re-read the chunk ID when iterating through subsequent chunks. This avoids an infinite loop in the case where the `data` chunk doesn't immediately follow the `fmt ` chunk.
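
A minimal sketch of the loop shape this fix describes, assuming the standard RIFF layout (4-byte chunk ID, 4-byte little-endian size, payload padded to an even length). The helper names `read_u32_le` and `seek_to_chunk` are hypothetical and are not taken from spchcat's source. The key detail is that the chunk ID is read inside the loop, so every iteration examines the next chunk rather than the first one.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reads a 32-bit little-endian unsigned integer. Returns 1 on success. */
static int read_u32_le(FILE *file, uint32_t *out) {
  unsigned char b[4];
  if (fread(b, 1, 4, file) != 4) {
    return 0;
  }
  *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
         ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
  return 1;
}

/* Hypothetical helper: scans chunks starting at the current position
 * (normally just after the 12-byte RIFF/WAVE header) until wanted_id is
 * found. On success the file is positioned at the start of that chunk's
 * payload and *size_out holds the payload size. Returns 0 if the file
 * ends before the chunk is found. Assumes chunk sizes fit in a long. */
static int seek_to_chunk(FILE *file, const char *wanted_id,
                         uint32_t *size_out) {
  for (;;) {
    char id[4];
    uint32_t size;
    /* Re-read the ID on every iteration; reusing the first chunk's ID
     * here is what would cause the loop to never terminate. */
    if (fread(id, 1, 4, file) != 4) {
      return 0;
    }
    if (!read_u32_le(file, &size)) {
      return 0;
    }
    if (memcmp(id, wanted_id, 4) == 0) {
      *size_out = size;
      return 1;
    }
    /* Skip the payload plus the pad byte that follows odd-sized chunks. */
    long skip = (long)size + (long)(size & 1u);
    if (fseek(file, skip, SEEK_CUR) != 0) {
      return 0;
    }
  }
}
```

Calling `seek_to_chunk(file, "data", &size)` on a file whose `fmt ` chunk is followed by, say, a `LIST` chunk would skip both and stop at the audio payload.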

Running: `$ spchcat audio.wav` results in: ``` Warning: Model not found in /etc/spchcat/models/C/ Warning: Scorer not found in /etc/spchcat/models/C/ TensorFlow: v2.3.0-14-g4bdd3955115 Coqui STT: v1.1.0-0-gf3605e23 No model specified, cannot continue. Could...

Users will need `sox` installed on their system for this to work: `$ sudo apt install sox`

Is it possible to supply the sound stream via stdin (or a pipe)? I need to make something like the following work in a shell script or similar: ``` arecord...

Have you looked into using Vosk as a backend? My tests with DeepSpeech, Coqui STT, and Vosk indicate that Vosk runs on older hardware and with higher accuracy.