Bug in parsing some TTS data

Open jacksonh opened this issue 1 year ago • 6 comments

Most likely something with quoting

From a user:

I’ve often encountered this text-to-speech playback error in the current and past versions of Omnivore: the voice playback simply stops mid-sentence at random and the play/pause button vanishes. If I minimize the audio player, the banner says in red font “there was a problem playing the audio” or the like, and simply replaying does not get past the blockage. I suspect the culprit is the circled isolated quote in this screenshot. Playback failures tend to occur a little before these lone quotation marks, and if I start the audio from after the quote, it is able to finish playing. I'm not 100% sure, but I hope it helps with reproducing the bug. This particular article is here: https://archive.ph/20230806165118/https://www.nytimes.com/2023/08/04/opinion/oppenheimer-stalin.html
Screenshot 2024-01-16 at 08 58 12

jacksonh • Jan 16 '24 00:01

Tokenized sentences [
  'Nor was Stalin any kind of naïve, unsuspecting victim of Hitler’s Barbarossa onslaught, as some historical clichés would have it.',
  'McMeekin makes an extended case that Stalin was preparing to attack Nazi Germany when Hitler attacked him, that the two dictators were basically in a race to see who could mobilize to betray the other first — and that the initial Soviet debacle in 1941 happened in part because Stalin was also pushing his military toward an offensive alignment, and they were caught in a “mid-mobilization limbo.',
  '”'
]
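The output above shows the closing smart quote split off into a sentence of its own. One possible mitigation, sketched here in Python purely for illustration, is to fold such quote-only fragments back into the preceding sentence before synthesis. The function name `merge_dangling_closers` and the set of closing characters are assumptions made for this sketch; this is not Omnivore's tokenizer code, and the thread does not confirm that this fragment is the actual cause.

```python
import re

# Matches a fragment made up only of closing quotes/brackets and whitespace.
# The character set is an illustrative assumption, not taken from Omnivore.
CLOSERS_ONLY = re.compile(r'^[\s"\'”’»)\]]+$')


def merge_dangling_closers(sentences):
    """Append quote-only fragments to the sentence that precedes them."""
    merged = []
    for sentence in sentences:
        if merged and CLOSERS_ONLY.match(sentence):
            # Fragment such as '”': attach it to the previous sentence.
            merged[-1] += sentence.strip()
        else:
            merged.append(sentence)
    return merged
```

Applied to the three tokenized sentences above, this would leave two sentences, with the second one ending in the closing quote instead of being followed by a lone `'”'` entry.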

sywhb • Jan 16 '24 01:01

For some reason I'm still getting this issue on iOS production. Smart quotes seem to break TTS

DavisOwen • Apr 20 '24 00:04

> For some reason I'm still getting this issue on iOS production. Smart quotes seem to break TTS

Thanks, do you have an article with them that you could share?

jacksonh • Apr 20 '24 01:04

OK, I spent quite a bit of time trying to reproduce this reliably. The issue appears to occur when:

  1. A line starts with a quotation mark (it doesn't matter whether it's a smart quote or not), and there is no space between the quote and the first character.
  2. The line is a run-on sentence with no periods or newlines, and it spans more than 256 characters.
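To find other candidate lines for reproduction, the two conditions above can be written as a small check. This is a minimal sketch that assumes both conditions must hold together, which the thread does not state explicitly. The name `matches_reported_pattern` and the list of quote characters are hypothetical.

```python
# Quote characters to test for; an illustrative assumption.
QUOTES = ('"', "'", '“', '”', '‘', '’')
MAX_RUN = 256  # length threshold reported in this thread


def matches_reported_pattern(line: str) -> bool:
    """Return True if a line meets both reported trigger conditions."""
    # Condition 1: opening quote immediately followed by a non-space character.
    starts_with_tight_quote = (
        len(line) > 1 and line[0] in QUOTES and not line[1].isspace()
    )
    # Condition 2: no periods or newlines, and longer than the threshold.
    is_long_run_on = (
        '.' not in line and '\n' not in line and len(line) > MAX_RUN
    )
    return starts_with_tight_quote and is_long_run_on
```

Running this over the paragraphs of a saved article would flag lines worth trying with TTS; a flagged line is only a candidate, not a confirmed failure.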

Here is an example text (randomly generated):

"I never knew what hardship looked like until it started raining bowling balls Not all people who wander are lost Peanuts don't grow on trees, but cashews do Even with the snow falling outside, she felt it appropriate to wear her bikini He was sitting in a trash can with high street class

I used this free website to generate a link that you can save as an Omnivore article to reproduce the issue: https://ctxt.io/2/AADIS7tQFg. Just save that and try to TTS it.

DavisOwen • Apr 20 '24 17:04

I've added a few examples in https://github.com/omnivore-app/omnivore/issues/4008

saeedesmaili • May 29 '24 10:05

Are there any updates on this issue?

saeedesmaili • Oct 13 '24 22:10