Key used to indicate the number of pages rendered by `parse` isn't always the same
- Platform: Darwin Malos-MBP 19.4.0 Darwin Kernel Version 19.4.0: Wed Mar 4 22:28:40 PST 2020; root:xnu-6153.101.6~15/RELEASE_X86_64 x86_64 i386 MacBookPro16,1 Darwin
- Mercury Parser Version: 2.2.0 (latest release)
Expected Behavior
Regardless of whether `parse` renders a single page or multiple pages, the key in the object returned by `parse` that indicates how many pages were rendered should be the same.
Current Behavior
If `parse` renders a single page, the key is `rendered_pages`; otherwise it's `pages_rendered`.
Steps to Reproduce
❯ node --experimental-repl-await
Welcome to Node.js v12.15.0.
Type ".help" for more information.
> const Mercury = require('@postlight/mercury-parser')
undefined
> const singlePage = 'https://daringfireball.net/linked/2020/04/27/snell-ipad-magic-keyboard'
undefined
> const multiPage = 'https://arstechnica.com/gadgets/2016/08/the-connected-renter-how-to-make-your-apartment-smarter'
undefined
> (await Mercury.parse(singlePage)).rendered_pages
1
> (await Mercury.parse(singlePage)).pages_rendered
undefined
> (await Mercury.parse(multiPage)).rendered_pages
undefined
> (await Mercury.parse(multiPage)).pages_rendered
3
> (await Mercury.parse(multiPage, { fetchAllPages: false })).rendered_pages
1
> (await Mercury.parse(multiPage, { fetchAllPages: false })).pages_rendered
undefined
> .exit
Possible Solution
Replace all instances of `pages_rendered` with `rendered_pages` in the codebase. Doing the opposite would also fix the issue, but it seems best to do the former since Mercury's documentation references `rendered_pages`.
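Until that lands, callers could work around the inconsistency on their side. The following is only a minimal sketch: `normalizePageCount` is a hypothetical helper, not part of Mercury Parser, and it assumes the result object carries the count under exactly one of the two keys observed above.

```javascript
// Hypothetical caller-side helper (not part of Mercury Parser).
// Returns a copy of a parse result in which the page count is always
// exposed as `rendered_pages`, whichever key the parser actually used.
function normalizePageCount(result) {
  // Pull `pages_rendered` out so it never appears in the returned object.
  const { pages_rendered, ...rest } = result;

  // Multi-page case: only `pages_rendered` was set, so rename it.
  if (rest.rendered_pages === undefined && pages_rendered !== undefined) {
    return { ...rest, rendered_pages: pages_rendered };
  }

  // Single-page case: `rendered_pages` is already present.
  return rest;
}
```

With this in place, `normalizePageCount(await Mercury.parse(url)).rendered_pages` could be used for both the single-page and the multi-page URL from the steps above.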