pheme
Can a V100 train the models?
I was wondering if it is possible to infer coherent speech word by word, without requiring the whole sentence to be passed at once. This would greatly decrease the inference time...
Does the autoregressive decoding of the T2S stage produce random hallucinated results, such as repeated words/phrases or long silences? How does this relate to the reported WER results?
We have added code for a WebUI using Gradio. The UI looks like the following 
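For reference, a minimal sketch of how such a Gradio wrapper might be put together (the `synthesize` function and the placeholder tone below are hypothetical stand-ins, not the repository's actual entry points):

```python
# Hypothetical Gradio WebUI sketch; a real wrapper would call the Pheme
# inference pipeline inside `synthesize` instead of emitting a test tone.
import numpy as np
import gradio as gr

SAMPLE_RATE = 16_000

def synthesize(text: str):
    # Placeholder output: a short sine tone so the demo runs end to end.
    duration = max(1.0, 0.05 * len(text))
    t = np.linspace(0.0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    audio = (0.2 * np.sin(2 * np.pi * 220.0 * t)).astype(np.float32)
    return SAMPLE_RATE, audio

demo = gr.Interface(
    fn=synthesize,
    inputs=gr.Textbox(label="Text to synthesize"),
    outputs=gr.Audio(label="Generated speech"),
    title="Pheme TTS demo",
)

if __name__ == "__main__":
    demo.launch()
```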
Hey, first of all, thanks for doing and publishing such great work! But coming to the practical side, I am looking at rendering my favourite set of Matrix quotes: ```...
Hello everyone, I've noticed that throughout the pipeline, [unknown tokens are removed](https://github.com/PolyAI-LDN/pheme/blob/main/data/semantic_dataset.py#L122), and that `unique_text_tokens.k2symbols` doesn't contain all the necessary phonemes for non-English languages, such as accents and other diacritics....
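One way to see what is being dropped is a quick diagnostic along these lines (a sketch only; it assumes the `.k2symbols` file is a plain-text table whose first whitespace-separated field on each line is the symbol, and that phonemized transcripts are whitespace-separated, which may not match the actual formats):

```python
# Sketch: list phonemes in a transcript that are absent from the symbol
# table and would therefore be treated as unknown tokens and removed.
# The path and file-format assumptions here are hypothetical.
def load_symbols(path: str) -> set[str]:
    with open(path, encoding="utf-8") as f:
        return {line.split()[0] for line in f if line.strip()}

def missing_phonemes(phonemes: list[str], symbol_path: str) -> set[str]:
    known = load_symbols(symbol_path)
    return {p for p in phonemes if p not in known}

# Example: nasalized vowels common in Portuguese that may be missing.
print(missing_phonemes(["a", "ɐ̃", "õ"], "ckpt/unique_text_tokens.k2symbols"))
```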
It pronounces $4,000 as "four zero zero zero" and sometimes as "four zero zero zero zero zero".
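A common workaround is to expand numbers and currency amounts into words before the text reaches the model. A minimal sketch using the `num2words` package (the regex and the dollar-only assumption are mine, not part of Pheme):

```python
# Hypothetical text pre-normalization step: spell out dollar amounts so the
# model never sees raw digits. Handles only the "$1,234" pattern.
import re
from num2words import num2words

def normalize_currency(text: str) -> str:
    def repl(match: re.Match) -> str:
        amount = int(match.group(1).replace(",", ""))
        return f"{num2words(amount)} dollars"
    return re.sub(r"\$([0-9][0-9,]*)", repl, text)

print(normalize_currency("The car costs $4,000."))
# -> "The car costs four thousand dollars."
```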
Hi - First, thank you for sharing your impressive work! I was able to train a model in another language with a similar amount of data to your 100M-parameter model....
[The result of TensorRT-LLM](https://github.com/PolyAI-LDN/pheme#a100-gpu--100m-pheme-variant) is amazing. If this is real, streaming is not needed at all.