
MathML in the HTML Output to Make it Accessible to Screen Readers

Open mohammad-tau opened this issue 10 months ago • 2 comments

This tool is a great accessibility tool. Thanks a lot for creating it. It's a game changer for me as a blind person. I need these kinds of conversions so that a PDF becomes accessible to me when I read it as HTML or Markdown, especially the LaTeX conversion and the image extraction. I tried the tool on a PDF document where the MathPix service failed to give a useful result, and the output impressed me, so I am thinking of trying it on more of the arXiv articles I need to read. The barrier is that the HTML doesn't seem to include the equations written in MathML, which a screen reader needs in order to consume and present them properly. I haven't tried the --use_llm option, but I am not sure it would make any difference. Please correct me if I'm wrong. Of course, reading the Markdown version is a fallback, but having the HTML properly constructed is a big plus. I am sure there are several ways to convert the math in Markdown to MathML. I can help investigate and implement them in my spare time.

mohammad-tau, Feb 19 '25 13:02

I'm really glad this is helping you, mohammad! By default, marker converts all block math to LaTeX. The --use_llm flag also converts all inline math to LaTeX. Can screen readers work with LaTeX, or is MathML a requirement? Converting LaTeX to MathML shouldn't be too hard.
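
For illustration, here is a minimal sketch of how that post-processing step could look, using only the Python standard library. It assumes the Markdown output delimits block math with `$$...$$` and inline math with `$...$`. The names `replace_math` and `placeholder_converter` are hypothetical and not part of marker. The placeholder only wraps the LaTeX source in a MathML annotation; a real LaTeX-to-MathML converter (for example a library such as latex2mathml) would have to be passed in as `convert` to get markup a screen reader can actually navigate.

```python
import re
from html import escape
from typing import Callable

# Block math ($$...$$) is listed first so it wins over inline math ($...$).
MATH_PATTERN = re.compile(r"\$\$(.+?)\$\$|\$([^$\n]+?)\$", re.DOTALL)


def placeholder_converter(latex: str, display: str) -> str:
    """Hypothetical stand-in for a real LaTeX-to-MathML converter.

    It keeps the LaTeX source as text plus an annotation. It does NOT
    produce presentation MathML, so it is only useful for wiring things up.
    """
    source = escape(latex.strip())
    return (
        f'<math display="{display}"><semantics><mtext>{source}</mtext>'
        f'<annotation encoding="application/x-tex">{source}</annotation>'
        "</semantics></math>"
    )


def replace_math(
    markdown: str,
    convert: Callable[[str, str], str] = placeholder_converter,
) -> str:
    """Replace every math span in `markdown` with the output of `convert`."""

    def substitute(match: re.Match) -> str:
        block, inline = match.group(1), match.group(2)
        if block is not None:
            return convert(block, "block")
        return convert(inline, "inline")

    return MATH_PATTERN.sub(substitute, markdown)


if __name__ == "__main__":
    print(replace_math("Energy is $E = mc^2$ and\n$$\na < b\n$$"))
```

Keeping the converter as a parameter means the delimiter handling can be tested on its own, and the choice of conversion library stays open.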

VikParuchuri, Feb 19 '25 14:02

Regards;

Loai Abushawar




loaishawar, Feb 19 '25 16:02