All pages with dates (e.g. collections) are tagged as `"article"` pages

Open CookiePLMonster opened this issue 7 months ago • 4 comments

Issue originally opened in https://github.com/jekyll/jekyll/issues/9817 and moved to this repository because, according to @ashmaroli, it's more of a jekyll-seo-tag issue:

When collection items do not get a date from either front matter or the filename, they are given a date that corresponds to the last time the website was built. This normally doesn't cause issues because the code using those collections doesn't need to care about dates, but it creates problems with this plugin, since all pages with a date are given the "article" type rather than "website". As far as I can tell, there is no way to opt out of this behaviour for specific collection pages.

Code Sample

This issue is observable on my blog: view-source:https://cookieplmonster.github.io/mods/gta/

where all /mods/ pages are built from a _games collection, where items do not have dates specified: https://github.com/CookiePLMonster/CookiePLMonster.github.io/tree/master/_games

At the time of submitting this issue, jekyll-seo-tag outputs the following tags for this page:

<meta property="og:type" content="article" />
<meta property="article:published_time" content="2025-04-29T18:20:50+00:00" />

where this date is the last time the website was built. This behaviour is not desirable: I want those pages to have the type "website" and no dates specified, much like pages built from regular Jekyll pages.

CookiePLMonster avatar May 01 '25 16:05 CookiePLMonster

Pinging @parkr for comments.

ashmaroli avatar May 01 '25 17:05 ashmaroli

Is the specific problem that a Document without a date gets property="article:published_time" instead of property="website:published_time"? should we just omit this field if no date exists instead?

parkr avatar May 07 '25 20:05 parkr

> Is the specific problem that a Document without a date gets property="article:published_time" instead of property="website:published_time"? should we just omit this field if no date exists instead?

No, it's kind of the opposite of that. The issue is that there is no way to opt out of jekyll-seo-tag setting the content type to "article" if a page has a date, and collection items always get dates. If you check the now-closed Jekyll issue linked in the opening post, I initially thought collections always getting dates was a bug, but apparently it's not.

CookiePLMonster avatar May 07 '25 20:05 CookiePLMonster

The specific problem (as far as jekyll-seo-tag and this issue ticket are concerned) is that jekyll-seo-tag considers all resources with date metadata (i.e. data["date"] != nil || false) as an "article" even if the said resource is not a blog post (the kind of resource typically considered an article). Consider:

  • Jekyll::Document [...] collection=posts with data["date"] automatically populated by Jekyll
  • Jekyll::Document [...] collection=movies with data["date"] automatically populated by Jekyll
  • Jekyll::Page [...] with user-provided date: #<Date ...> in data hash.

IMO, the best solution would be to have this plugin check whether the current page is actually a blog post and only then consider it an "article".
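
To make the difference concrete, here is a minimal Ruby sketch comparing the date-based rule described in this thread with a collection-based check. It is not the plugin's actual code: `Resource`, `og_type_by_date` and `og_type_by_collection` are hypothetical names made up for illustration, and treating "blog post" as "belongs to the posts collection" is an assumption.

```ruby
# Hypothetical stand-in for a Jekyll page or document; NOT a real Jekyll class.
Resource = Struct.new(:collection_label, :data, keyword_init: true)

# Rule as described in this thread: any truthy date means "article".
def og_type_by_date(resource)
  resource.data["date"] ? "article" : "website"
end

# Proposed rule (assumption: a blog post is a document in the "posts" collection).
def og_type_by_collection(resource)
  resource.collection_label == "posts" ? "article" : "website"
end

# The three cases listed above, each carrying a date.
post  = Resource.new(collection_label: "posts",  data: { "date" => Time.now })
movie = Resource.new(collection_label: "movies", data: { "date" => Time.now })
page  = Resource.new(collection_label: nil,      data: { "date" => Time.now })

[post, movie, page].each do |resource|
  puts "#{resource.collection_label.inspect}: " \
       "#{og_type_by_date(resource)} -> #{og_type_by_collection(resource)}"
end
```

With the date-based rule all three resources come out as "article"; with the collection-based rule only the post does. Note that under this sketch a standalone page with a user-provided date would also become "website", which may or may not be what users expect.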

ashmaroli avatar May 08 '25 07:05 ashmaroli

I just discovered that this defect has another consequence: because of the same problem, the JSON-LD entry classifies all those pages as BlogPosting: https://github.com/jekyll/jekyll-seo-tag/blob/d61a2a84b885ea2e406e2d4c84fbcd37243bd0e7/lib/jekyll-seo-tag/drop.rb#L125-L137

CookiePLMonster avatar Jul 25 '25 11:07 CookiePLMonster