Missing "alt" accessibility attributes from art-of-problem-solving

Open dginev opened this issue 4 years ago • 5 comments

Hi openzim folks,

I wanted to let you know that the "art-of-problem-solving" resource is missing the accessibility attributes on its formulas. The formulas there are displayed as images, using <img class="latex"> tags.

On the live site each of those has an "alt" attribute that can be read by screen-reading software. Example page with some formulas at: https://artofproblemsolving.com/wiki/index.php/1962_AHSME_Problems/Problem_37

The kiwix resources I was examining are the two art-of-problem-solving zim files at: https://wiki.kiwix.org/wiki/Content_in_all_languages

It appears that all of those alt attributes have been stripped out. I would be interested to know why, and it would be great if we could get them back in the kiwix zim files. Thanks!
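
For anyone who wants to reproduce the check, here is a minimal sketch of how the missing attributes could be detected in a saved HTML page. It uses only the Python standard library; the names LatexAltChecker and check_latex_alts are hypothetical helpers made up for this illustration and are not part of mwoffliner or kiwix. It assumes the formula images carry the "latex" class as described above.

```python
from html.parser import HTMLParser


class LatexAltChecker(HTMLParser):
    """Count <img class="latex"> tags and record those without usable alt text."""

    def __init__(self):
        super().__init__()
        self.total = 0      # number of formula images seen
        self.missing = []   # src values of formula images lacking a non-empty alt

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attr.get("class") or "").split()
        if "latex" not in classes:
            return
        self.total += 1
        # Treat an absent, valueless, or whitespace-only alt as missing.
        if not (attr.get("alt") or "").strip():
            self.missing.append(attr.get("src"))


def check_latex_alts(html):
    """Return (number of formula images, list of src values missing alt)."""
    checker = LatexAltChecker()
    checker.feed(html)
    checker.close()
    return checker.total, checker.missing


# Hypothetical sample markup, only to show the call.
sample = (
    '<p><img class="latex" src="a.png" alt="$x^2$">'
    '<img class="latex" src="b.png">'
    '<img class="logo" src="c.png"></p>'
)
total, missing = check_latex_alts(sample)
print(f"{len(missing)} of {total} formula images lack alt text: {missing}")
```

Running the same function over a page saved from the live site and over the matching page extracted from the zim file should make the difference visible.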

dginev avatar Feb 19 '21 12:02 dginev

@dginev Thank you for your bug report. alt DOM attributes are important; they should not be stripped out and, AFAIK, they are not. We now need to understand why they are missing. Probably the alternative wikitext parser we use (called Parsoid) does not generate them.

kelson42 avatar Feb 19 '21 13:02 kelson42

Two similar issues:

  • https://github.com/openzim/mwoffliner/issues/164
  • https://github.com/openzim/mwoffliner/issues/114

kelson42 avatar Feb 19 '21 13:02 kelson42

@dginev This is an old upstream bug: https://phabricator.wikimedia.org/T209277. Actually, this ticket is a duplicate of the old #164. We had closed #164 because the previous reporter was happy with the workaround... but the bug is still there.

kelson42 avatar Mar 26 '21 07:03 kelson42

Thanks for tracking this down @kelson42! I am indeed one user who truly needs this if I am to rely on the kiwix sources. For art-of-problem-solving I have moved forward by freshly crawling and downloading it myself, and then working with the public HTML pages. Luckily it is relatively small. But it would have been so much better if I could assume that all kiwix-distributed resources have the accessibility attributes preserved.

dginev avatar Mar 26 '21 17:03 dginev

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Jun 02 '21 16:06 stale[bot]

I'm closing this ticket because we are no longer able to scrape https://artofproblemsolving.com/wiki/. No remote API that we support is available there anymore.

kelson42 avatar Feb 05 '23 11:02 kelson42