MathML in the HTML Output to Make it Accessible to Screen Readers
This is a great accessibility tool. Thanks a lot for creating it. It's a game changer for me as a blind user: I need these kinds of conversions so that a PDF becomes accessible to me when I read it as either HTML or Markdown, especially the LaTeX conversion and the image extraction. I've tried the tool on a PDF document where the Mathpix service failed to give a useful result, and the output impressed me, so I am thinking of trying it on more of the arXiv articles I need to read.
The barrier is that the HTML output doesn't seem to include the equations as MathML, which a screen reader needs in order to consume and present them properly. I haven't tried the --use_llm option, but I am not sure it will make any difference. Please correct me if I'm wrong.
Of course, reading the Markdown version is a fallback mechanism, but having the HTML properly constructed is a big plus.
I am sure there are several ways to convert the math in the Markdown to MathML. I can help investigate and implement them in my spare time.
I'm really glad this is helping you, mohammad! By default, marker will convert all block math to LaTeX. The --use_llm flag will also convert all the inline math to LaTeX. Can screen readers work with LaTeX, or is MathML a requirement? Converting LaTeX to MathML shouldn't be too hard to do.
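To make the LaTeX-to-MathML idea concrete, here is a minimal sketch of a post-processing step. It assumes the math appears in the Markdown output between `$...$` (inline) and `$$...$$` (block) delimiters, which is an assumption about the output format rather than something confirmed in this thread. The names `replace_math` and `toy_to_mathml` are hypothetical, and the toy converter only understands letters, integers, and `+ - =`; a real converter (for example one built on a library such as latex2mathml) would need to be plugged in for actual papers.

```python
import re
from typing import Callable

# Block math ($$...$$) is tried first so it is not mistaken for inline math ($...$).
MATH_PATTERN = re.compile(r"\$\$(.+?)\$\$|\$([^$\n]+?)\$", re.DOTALL)

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def toy_to_mathml(latex: str, block: bool) -> str:
    """Hypothetical stand-in converter: letters, integers and + - = only."""
    if re.search(r"[^A-Za-z0-9+\-=\s]", latex):
        # Refuse anything the toy cannot represent instead of emitting wrong MathML.
        raise ValueError(f"unsupported LaTeX for the toy converter: {latex!r}")
    parts = []
    for token in re.findall(r"[A-Za-z]|\d+|[+\-=]", latex):
        if token.isalpha():
            parts.append(f"<mi>{token}</mi>")  # identifier
        elif token.isdigit():
            parts.append(f"<mn>{token}</mn>")  # number
        else:
            parts.append(f"<mo>{token}</mo>")  # operator
    display = "block" if block else "inline"
    return (
        f'<math xmlns="{MATHML_NS}" display="{display}">'
        f"<mrow>{''.join(parts)}</mrow></math>"
    )


def replace_math(
    text: str, to_mathml: Callable[[str, bool], str] = toy_to_mathml
) -> str:
    """Replace every $...$ / $$...$$ span in text with the converter's MathML."""

    def substitute(match: re.Match) -> str:
        block_src, inline_src = match.group(1), match.group(2)
        if block_src is not None:
            return to_mathml(block_src.strip(), True)
        return to_mathml(inline_src.strip(), False)

    return MATH_PATTERN.sub(substitute, text)


if __name__ == "__main__":
    print(replace_math("Let $x = 2$ and then $$y = x + 1$$"))
```

Keeping the converter as a parameter means the delimiter handling can be tested on its own, and the toy can be swapped for a full LaTeX converter without touching the rest.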
mohammad-tau created an issue (VikParuchuri/marker#563): https://github.com/VikParuchuri/marker/issues/563