Missing "alt" accessibility attributes from art-of-problem-solving
Hi openzim folks,
I wanted to let you know that the "art-of-problem-solving" resource is missing the accessibility attributes on its formulas. The formulas there are displayed as images, using <img class="latex"> tags.
On the live site each of those has an "alt" attribute that can be read by screen-reading software. Example page with some formulas at: https://artofproblemsolving.com/wiki/index.php/1962_AHSME_Problems/Problem_37
The kiwix resources I was examining are the two art-of-problem-solving zim files at: https://wiki.kiwix.org/wiki/Content_in_all_languages
It appears that all of those alt attributes have been stripped out. I would be interested to know why, and it would be great if we could get them back into the kiwix zim files. Thanks!
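For anyone who wants to reproduce the check, here is a minimal sketch that counts the <img class="latex"> tags in a page and lists the ones without usable alt text. It assumes the page HTML has already been obtained as a string (from the live site or extracted from the zim file); the names `LatexAltAudit` and `audit` and the sample markup are made up for illustration and are not part of any kiwix or openzim tool.

```python
from html.parser import HTMLParser


class LatexAltAudit(HTMLParser):
    """Collect <img class="latex"> tags and note which ones lack alt text."""

    def __init__(self):
        super().__init__()
        self.total = 0      # number of formula images seen
        self.missing = []   # src values of formula images without alt text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # class may hold several space-separated names, or be absent
        classes = (attr.get("class") or "").split()
        if "latex" not in classes:
            return
        self.total += 1
        # treat a missing, valueless, or whitespace-only alt as missing
        if not (attr.get("alt") or "").strip():
            self.missing.append(attr.get("src"))


def audit(html):
    """Return (formula image count, list of src values lacking alt text)."""
    parser = LatexAltAudit()
    parser.feed(html)
    parser.close()
    return parser.total, parser.missing


# Hypothetical sample markup: one formula with alt, one without, one unrelated image.
sample = (
    '<p><img class="latex" src="a.png" alt="$x^2$">'
    '<img class="latex" src="b.png">'
    '<img class="logo" src="c.png"></p>'
)
total, missing = audit(sample)
print(total, missing)
```

Running the same function over the live page and over the page from the zim file should make the difference visible: the live page would report no missing entries, while the zim copy would list every formula image.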
@dginev Thank you for your bug report. alt DOM attributes are important; they should not be stripped out and, AFAIK, we do not strip them. We now need to understand why they are not there. It is probably because the alternative wiki code parser we use (called Parsoid) does not generate them.
Two similar issues:
- https://github.com/openzim/mwoffliner/issues/164
- https://github.com/openzim/mwoffliner/issues/114
@dginev This is an old upstream bug: https://phabricator.wikimedia.org/T209277. This ticket is actually a duplicate of the older #164. We had closed #164 because the previous reporter was happy with the workaround... but the bug is still there.
Thanks for tracking this down @kelson42 ! I am indeed one user who truly needs this if I am to rely on the kiwix sources. For art-of-problem-solving I have moved forward by freshly crawling and downloading it myself, and then working with the public HTML pages. Luckily it is relatively small. But it would have been so much better if I could assume that all kiwix-distributed resources have their accessibility attributes preserved.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
I'm closing this ticket because we are no longer able to scrape https://artofproblemsolving.com/wiki/. The site no longer offers a remote API that we support.