epicenter icon indicating copy to clipboard operation
epicenter copied to clipboard

Feature idea: Smart paragraphs and punctuation check

Open rick-urb opened this issue 1 year ago • 3 comments

Though it would slow down the generation of text, I often find myself wishing that a block of text gets edited a little bit by AI via punctuation and some paragraphs to make it more readable. I very often have to do this manually afterwards.

There is a cool app on the App Store and Play Store called "Letterly" where you can talk into the microphone, and then you get a transcription and you can also have AI edit it without changing the text. This kind of effect would be amazing.

I don't want the text to be turned into a formal email or stuff like this. Just a quick check on punctuation and empty rows and so on to make a text more readable where it's sensible.

Maybe this could be added as an optional checkbox, with a quick note that it will slow down text generation for the benefit of better-looking text.

rick-urb avatar Jul 02 '24 10:07 rick-urb

Don't mean to speak for @braden-w, but this definitely does not sound like something "easy" to implement, due to variability and ill-definition.

Not saying it is not possible. In fact once Whispering gets the output onto one's clipboard, one could "pipe" it to another AI tool such as GPT-4o for post-processing, with specific instructions. That will consume extra tokens, of course.

Just take a look here at OpenAI's own blurb: https://platform.openai.com/docs/guides/speech-to-text/improving-reliability.

For punctuation and paragraph "rendering", you may also be able to resort to other non-AI techniques.

doxgt avatar Jul 04 '24 23:07 doxgt

This would be solvable by the user if output post processing with custom prompts gets added.

cgbur avatar Jul 04 '24 23:07 cgbur

This would be solvable by the user if output post processing with custom prompts gets added.

Prompting definitely helps. But to my personal disappointment it does not "solve" it. I have to rely on regex (i.e., "non-AI-based" post-processing) extensively to mitigate it.

doxgt avatar Jul 05 '24 00:07 doxgt

Clossing this since custom prompts have been added! For reference, you can configure a user prompt on the transcription level, or even better, configure transformations as demonstrated in the demo or the advanced features section of the README.

Image

braden-w avatar Jul 08 '25 18:07 braden-w