Parsing lead_image_url when multiple og:image tags are present
- Platform: OS X
- Mercury Parser Version: 2.2.0
- Node Version (if a Node bug): v12.16.2
Expected Behavior
If a page sets og:image more than once, the parser should choose one of them as the lead_image_url. Duplicate og:image tags are clearly a mistake on the site's part, but I would still like the parser to extract the image in this scenario.
Current Behavior
It chooses neither of the og:image values and falls back to another image on the page.
Steps to Reproduce
const MercuryParser = require("@postlight/mercury-parser");
// Any page with two `og:image` tags set
const url = "https://www.realityblurred.com/realitytv/2017/08/ayto-season-six-host-terrence-j/";
// parse() returns a promise; use .then() because top-level await is unavailable in a CommonJS script
MercuryParser.parse(url).then((x) => console.log(x.lead_image_url));
This prints https://www.realityblurred.com/realitytv/wp-content/themes/realityblurred/images/Andy-Dehnart.jpg, which is the first image in the body of the page. The page does have an og:image, but the identical tag is specified twice in the head:
<meta property="og:image" content="https://www.realityblurred.com/realitytv/images/2017/08/ayto-season-six-cast.jpg">
Detailed Description
I'm trying to get the lead image url out of the above page.
Possible Solution
If a page contains multiple og:image tags, choose the first one.
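To illustrate the idea, here is a minimal sketch of the "first one wins" rule. It is not Mercury Parser's actual code: `firstOgImage` is a hypothetical helper, and it assumes the page's meta tags have already been collected into a list of `{ property, content }` objects in document order.

```javascript
// Hypothetical helper, not part of Mercury Parser's API.
// Assumes `metaTags` is an array of { property, content } objects
// gathered from the page's <meta> elements in document order.
function firstOgImage(metaTags) {
  // Take the first og:image entry with a non-empty content value,
  // instead of giving up when more than one is present.
  const match = metaTags.find(
    (tag) =>
      tag.property === "og:image" &&
      typeof tag.content === "string" &&
      tag.content.trim() !== ""
  );
  return match ? match.content.trim() : null;
}

// The duplicated tag from the page above, as it would appear in the list.
const tags = [
  { property: "og:title", content: "Example title" },
  { property: "og:image", content: "https://www.realityblurred.com/realitytv/images/2017/08/ayto-season-six-cast.jpg" },
  { property: "og:image", content: "https://www.realityblurred.com/realitytv/images/2017/08/ayto-season-six-cast.jpg" },
];

console.log(firstOgImage(tags));
```

With this rule, duplicate tags resolve to the same URL, and pages that list several different og:image values still get a deterministic result.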