
Case of the Missing ‘a’s When ReadAloud Reads Math

Open brichwin opened this issue 4 years ago • 3 comments

The ReadAloud feature is not always speaking lower-case 'a's when they appear in MathML content. For example, if the EPUB contains MathML for the Quadratic Formula that results in the following MathSpeak Grammar from MathJax: "x equals StartFraction negative b plus-or-minus StartRoot b squared minus 4 a c EndRoot Over 2 a EndFraction", not all of the lower-case a's are read aloud. The last 'a' character present in the text (“… EndRoot Over 2 a EndFraction”) is not spoken.

I have attached a video recording of Thorium Reader v2.0.0-alpha.1.1962235861 speaking the quadratic formula here: https://user-images.githubusercontent.com/117550/161293839-94c3cef0-0828-42c1-bf64-2fdeb30ef1fb.mp4

I’m not sure why, but many TTS engines skip voicing lower-case 'a's depending on where they appear. Also, the short 'a' sound ('ah') is not what we'd use when speaking the variable 'a' in math; we'd speak the variable using the long 'A' sound. I suggest transforming the Math Speech text with a regex that capitalizes all stand-alone lowercase 'a's before sending it to the TTS to be spoken. For example:

        speechText = speechText.trim().replace(/\ba\b/g,"A");

Then the 'a' variables will always be spoken by the TTS (I have tested this) and the resulting manner for speaking 'a's in math will be more natural.
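
A minimal sketch of the suggestion above, wrapped in a helper so it can be tried outside Thorium. The function name `upcaseSolitaryA` is hypothetical and is not part of Thorium or MathJax; the input is the MathSpeak string quoted earlier in this issue.

```javascript
// Hypothetical helper (not a Thorium or MathJax API): upper-case every
// stand-alone lower-case "a" in the speech text before it goes to the TTS.
function upcaseSolitaryA(speechText) {
  // \b matches only at word boundaries, so the "a" inside words such as
  // "StartFraction" or "negative" is left untouched.
  return speechText.trim().replace(/\ba\b/g, "A");
}

const mathSpeak =
  "x equals StartFraction negative b plus-or-minus StartRoot b squared minus 4 a c EndRoot Over 2 a EndFraction";

// Both stand-alone variables ("4 a c" and "2 a") come out as "A".
console.log(upcaseSolitaryA(mathSpeak));
```

Whether a given TTS voice then pronounces "A" as the long vowel is engine-dependent, so this only addresses the skipped-letter part of the problem.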

As a second opinion on this issue, note that Neil Soiffer's MathCAT (https://nsoiffer.github.io/MathCAT/) appears to be up-casing lowercase 'a's as well.

brichwin avatar Apr 01 '22 15:04 brichwin

That's very interesting feedback, thank you. Also thank you very much for taking the time to provide a video demonstration, much appreciated. Funnily enough, the recent builds of Thorium v2 include a fix for MathML / MathJax specifically for screen reader users. Before the fix, Thorium configured MathJax to produce its own TTS textual label, which would either override existing authored alt text, or provide the missing TTS-friendly description. Unfortunately this caused the MathML to lose its structure, which in turn broke the screen reader user experience with some Math TTS plugins. In order to fix this, Thorium now detects the presence of a screen reader so as to not instruct MathJax to produce its own accessible alt text for MathML markup (note that MathJax continues to visually render the MathML, but preserves structural MathML alongside the visual presentation, if I understand correctly).

danielweck avatar Apr 06 '22 08:04 danielweck

In light of the background information I posted in the previous comment, and having watched your video demonstration again (especially the last part about the "captions view"), it seems that the alt text is authored / baked into the original markup, unless the TTS-friendly textual description is automatically generated by MathJax (which I doubt, as you were testing with a recent Thorium build which detects the screen reader to bypass this feature). Could you please check the HTML document to verify that the alt text is indeed authored at the source? If so, then we should be able to force uppercase on 'a' characters using a regular expression. How confident are you that .replace(/\ba\b/g,"A") will not cause regression bugs in Math expressions that may use \ba\b in different contexts?
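
One rough way to probe the regression question is to run the same regular expression over a few strings and inspect what changes. The probe strings below are made up for illustration and are not taken from any real EPUB or from MathJax output.

```javascript
// The same regular expression as in the suggested fix.
const solitaryA = /\ba\b/g;

// Hypothetical probe strings, chosen to show what \ba\b does and does not match.
const probes = [
  "2 a EndFraction",         // the variable: the intended match
  "the area of a circle",    // the English article "a" is matched as well
  "a-b",                     // a hyphen counts as a word boundary
  "StartRoot alpha EndRoot", // "a" inside a word is not matched
];

for (const text of probes) {
  console.log(JSON.stringify(text), "->", JSON.stringify(text.replace(solitaryA, "A")));
}
```

The second probe suggests the replacement is safest when applied only to speech text generated from MathML, not to authored alt text that may contain ordinary prose.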

danielweck avatar Apr 06 '22 08:04 danielweck

The "alt-text" was being generated by MathJax, as the math was encoded using MathML in the EPUB.

This issue was discussed in the Math Writing Solutions for VI group that is led by Homiyar Mobedji. Neil Soiffer (author of MathCAT) said he discovered that a capital A will not always be pronounced as a long "A". His solution was to convert both solitary 'a' and 'A's into "eigh" in the text sent to the TTS engine.
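
A sketch of that idea in the same style as the earlier regex, assuming a simple text substitution is enough. The helper name `respellSolitaryA` is hypothetical, and this is not MathCAT's actual implementation.

```javascript
// Hypothetical helper: respell every stand-alone "a" or "A" as "eigh"
// so the TTS engine is pushed toward the long vowel sound.
function respellSolitaryA(speechText) {
  // The character class covers both cases; \b keeps letters inside
  // words such as "EndRoot" or "Over" out of the match.
  return speechText.trim().replace(/\b[aA]\b/g, "eigh");
}

console.log(respellSolitaryA("minus 4 a c EndRoot Over 2 A EndFraction"));
```

Because the respelling changes the text itself, it would need to be applied only to the string handed to the TTS engine and not to any text shown to the reader.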

brichwin avatar Jun 22 '22 19:06 brichwin