Josh Meyer
Currently you can't boost phrases, only words split on whitespace. We should be able to boost phrases, too.
STT.readthedocs.io is missing instructions on how to install from the release page on a Raspberry Pi.
Currently, to train a scorer you need to perform two key steps, after you have a cleaned text corpus: 1. train a KenLM model with `STT/data/lm/generate_lm.py` 2. package the model...
If you follow the steps in https://stt.readthedocs.io/en/latest/LANGUAGE_MODEL.html, you will not be able to train an LM because there is a dependency on *compiled* KenLM binaries. This is not at all...
You may run multiple training runs with the same checkpoints. Each time you start a new training run, the `flags.txt` file gets overwritten. It would be...
We can point to the requirements for the base NVIDIA Docker image, and give people an idea of what *should* work and what *definitely won't* work: From NVIDIA's TF container...
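To make the overwrite problem concrete, here is a minimal sketch of one possible workaround: copy the existing `flags.txt` to a timestamped file before a new run starts. The helper name `backup_flags` and the `flags-<timestamp>.txt` naming scheme are assumptions made for illustration; they are not part of STT, and this is not necessarily the fix the issue proposes.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_flags(checkpoint_dir: str, now: Optional[datetime] = None) -> Optional[Path]:
    """Copy flags.txt to a timestamped sibling file (hypothetical helper).

    Returns the path of the backup, or None if there is no flags.txt yet.
    """
    flags = Path(checkpoint_dir) / "flags.txt"
    if not flags.exists():
        return None
    # Timestamp keeps one copy per run instead of a single overwritten file.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    backup = flags.with_name(f"flags-{stamp}.txt")
    shutil.copy2(flags, backup)
    return backup
```

Calling `backup_flags("path/to/checkpoints")` right before launching training would preserve the flags of the previous run next to the checkpoints.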
**Is your feature request related to a problem? Please describe.** Yes. I find myself saving training logs to a text file in what I consider a hacky style. On a server,...
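As an illustration of a less ad hoc approach, the sketch below uses Python's standard `logging` module to send the same messages to the console and to a file. It assumes the goal is simply to keep a copy of the training output on disk; the function name `make_logger` and the logger name `"training"` are hypothetical and do not come from the STT code base.

```python
import logging
import sys


def make_logger(log_path: str, name: str = "training") -> logging.Logger:
    """Return a logger that writes to stdout and to log_path (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    # Drop handlers from an earlier call so messages are not written twice.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_path, encoding="utf-8"),
    )
    for handler in handlers:
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```

With this in place, `make_logger("train.log").info("epoch 1 done")` prints the line and appends it to `train.log`, so no shell redirection is needed.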
Scenario: I have a large dataset where all transcripts are in ALL CAPS, but the alphabet I want to use (i.e. fine-tune `v1.2.0`) is in lower case. Current solution: I...
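For reference, a one-off preprocessing pass of the kind this scenario calls for might look like the sketch below. It assumes the data is in a CSV file with a `transcript` column; the function name `lowercase_transcripts` is hypothetical, and this is not necessarily the author's current solution.

```python
import csv


def lowercase_transcripts(src_csv: str, dst_csv: str, column: str = "transcript") -> int:
    """Write a copy of src_csv with `column` lower-cased; return the row count."""
    with open(src_csv, newline="", encoding="utf-8") as src, \
            open(dst_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        count = 0
        for row in reader:
            # Only the transcript changes; all other columns pass through untouched.
            row[column] = row[column].lower()
            writer.writerow(row)
            count += 1
    return count
```

Writing to a separate output file keeps the original ALL CAPS data intact, at the cost of duplicating a large dataset's metadata on disk.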
The use of `webdataset` in training was added in release `v1.2.0`, but we don't have docs for its usage yet in `stt.readthedocs.io`.
TTS has some nice GIFs in the docs [here](https://tts.readthedocs.io/en/latest/inference.html), and STT could easily have some too. A GIF for stt-model-manager and a GIF for using the Python client would be...