blogcaster
Logging audio issues
- [x] Leading spaces in the text cause odd artifacts before speech starts; strip them.
- [x] Things like $1mil are hard to filter and are not read correctly; maybe an LLM can rephrase them as written words?
- [x] Fractions don't work.
- [x] Parentheticals often sound really bad. Remove or rewrite them.
(Feel free to add, but this is my log book)
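
As a rough illustration of the first and last items in the log, here is a minimal sketch of a pre-processing step. The function name `clean_for_tts` and the regex are assumptions for illustration, not the project's actual code, and it only drops simple (non-nested) parentheticals rather than rewriting them.

```python
import re

# Hypothetical helper, not taken from the blogcaster code.
# Matches a parenthetical with no nested parentheses, plus any whitespace before it.
_PAREN = re.compile(r"\s*\([^()]*\)")

def clean_for_tts(text: str) -> str:
    """Drop simple parentheticals and strip leading whitespace."""
    text = _PAREN.sub("", text)
    # Collapse runs of spaces or tabs left behind by the removal.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # Leading whitespace is what caused odd sounds before speech starts.
    return text.lstrip()

print(clean_for_tts("  The model (a small one) works."))
```

Nested parentheses would need a loop or a small parser; this version deliberately leaves them alone.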
Considering the following system prompt:
Please perform the following task: translate the input into written word so a text-to-speech model can read it (things like fractions don't work well).
Examples include 1/4 to one quarter or $1.5m to one point five million dollars. Most dollar signs should be converted. When given a sentence, just replace those.
The logic will be to look for the following symbols in the text: $, /, x, and . (without a space after it).
Note: generate with temperature 0 for these :)
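
The symbol check described above could be sketched roughly as follows. Everything here is an assumption for illustration: the names `needs_llm_rewrite` and `prepare` are hypothetical, `rewrite` stands in for whatever function calls the LLM with the system prompt at temperature 0, and treating "x" as relevant only when it directly follows a digit (as in 3x) is a guess at the intent, since matching every "x" would flag most ordinary text.

```python
import re
from typing import Callable

# Symbols from the note: $, /, x, and a period with no space after it.
NEEDS_REWRITE = re.compile(
    r"""
    \$            # dollar sign, e.g. $1.5m
    | /           # slash, e.g. 1/4
    | \d\s*x      # an x right after a number (assumed meaning), e.g. 3x
    | \.(?=\S)    # period followed by a non-space character, e.g. 1.5
    """,
    re.VERBOSE,
)

def needs_llm_rewrite(text: str) -> bool:
    """Return True if the text contains any symbol the TTS model tends to misread."""
    return NEEDS_REWRITE.search(text) is not None

def prepare(text: str, rewrite: Callable[[str], str]) -> str:
    """Send the text through the (hypothetical) LLM rewrite only when flagged."""
    return rewrite(text) if needs_llm_rewrite(text) else text
```

Sentences without any of these symbols skip the LLM call entirely, which keeps clean text untouched. A sentence-final period is not flagged because nothing follows it.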