Document reader mode from author's perspective
Proposal
Reader mode (aka reader view) may be confusing for document authors. It uses various heuristics to extract content from the page, but those heuristics are not documented. I think that MDN should document how reader mode works, and what an author can do if their page is displayed incorrectly in reader mode.
Browser support
Reader mode is widely supported by browsers.
Tasks
No response
Dependencies
No response
Additional information
No response
Are you willing to support this work?
No response
Thanks for the proposal. I did a little digging and I can share that, at least in Firefox, these are the main criteria:
- Content type needs to be HTML
- Certain domains are blocked
- Content length needs to be at least 140 characters
- "Content score": checks for certain tags / metadata to estimate how likely the page is to contain text content, reported back as a score; the hyperlink ratio determines whether a block is likely to be main content or a navigational part of the page
- Content visibility: invisible elements or nodes without height and width are excluded
See
- https://searchfox.org/mozilla-central/source/toolkit/components/reader/Readerable.js#28-44
- https://github.com/mozilla/readability, with `isProbablyReaderable` being the first pass to see if Firefox shows the reader view icon in the toolbar.
Where this is documented on MDN is a different question.
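To make the criteria above more concrete, here is a rough sketch of how such a check could be structured. It is not Firefox's or Readability's actual code: the `Block` shape, the `looksReaderable` name, the score threshold, and the link-ratio cutoff are all assumptions for illustration. Only the 140-character minimum comes from the list above.

```typescript
// Hypothetical, simplified model of one candidate text block on a page.
interface Block {
  text: string;     // text content of the block
  linkText: string; // the part of that text that sits inside hyperlinks
  visible: boolean; // false for hidden nodes or nodes without height and width
}

const MIN_CONTENT_LENGTH = 140; // from the criteria listed above
const MIN_SCORE = 20;           // assumed threshold, illustration only
const MAX_LINK_RATIO = 0.5;     // assumed cutoff for "mostly navigation"

// Returns true when the page looks like it has enough readable text.
function looksReaderable(
  contentType: string,
  host: string,
  blockedHosts: string[],
  blocks: Block[]
): boolean {
  // Content type needs to be HTML.
  if (contentType.toLowerCase().indexOf("text/html") !== 0) return false;
  // Certain domains are blocked.
  if (blockedHosts.indexOf(host) !== -1) return false;

  let score = 0;
  for (const block of blocks) {
    // Invisible blocks are excluded.
    if (!block.visible) continue;
    const length = block.text.trim().length;
    // Blocks shorter than the minimum do not count.
    if (length < MIN_CONTENT_LENGTH) continue;
    // Link-heavy blocks are treated as navigation, not content.
    if (block.linkText.length / length > MAX_LINK_RATIO) continue;
    // Longer blocks add more to the score, with diminishing returns.
    score += Math.sqrt(length - MIN_CONTENT_LENGTH);
    if (score > MIN_SCORE) return true;
  }
  return false;
}
```

In this sketch a single long, visible paragraph can be enough to pass, while a page made only of short or link-heavy blocks never accumulates a score. Real implementations differ between browsers and change over time, so treat the numbers as placeholders.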
Unfortunately, it’s not a standard, and each browser implements reader mode in its own way. Even worse, browsers don’t consider this a developer-facing feature, so they don’t publish any docs, don’t talk about breaking changes, etc. You can only guess or reverse-engineer it based on source code. Also, these heuristics are constantly changing, and nobody can guarantee you that MDN docs will stay relevant even in a few months.
Yes, there could be a hand-wavy guide outlining a basic approach and suggesting common patterns that browsers (and other apps and services) use. But it won’t be a reference or something solid that developers could rely on.
Could we add something to https://developer.mozilla.org/en-US/docs/Web/HTML/How_to that provides the hints above, with heavy caveats that this can change and differs between browsers?