tech-docs-gem icon indicating copy to clipboard operation
tech-docs-gem copied to clipboard

Accept last_reviewed_on field as a string input

Open NickWalt01 opened this issue 3 years ago • 17 comments

What should change

Accept the use of last_reviewed_on field as a string input in any string variation; 2021/02/23, March 2027 23, January 13 2020, rather than a Date data type only. Whilst updating dependencies in the repositories listed in the next section I have found exceptions regarding the last_reviewed_on field as shown here. It appears that middleman expects the Date to be 2021/02/23 as highlighted on this page but the embedded ruby in this repository expects it to be 2021-02-23 as a Date object. As this field causes an exception it makes the title field fail causing the real error to be hidden by middleman asking for H1 tags when <%= current_page.data.title %> is used in the html file.

It currently fails here and the fix is to accept the string that is converted to a Date object:

def format_string_date(string_date)
   Date.parse(string_date)
end

User need

At MoJ we are using the tech-docs-gem in this repository to publish our template documents based on this template repository. We have many html pages that are utilising the tech-docs-gem to generate the html sites.

I have modified the tech-docs-gem ruby code and tested it works with both a string and a Date object.

NickWalt01 avatar May 04 '22 14:05 NickWalt01

Hi @laurahiles any idea how long it will take to implement the change? I can share the code and test if it helps

NickWalt01 avatar May 23 '22 14:05 NickWalt01

Thanks for raising this Nick. I've done a bit of digging, and to summarise the issue as I understand it:

  • In The Good Old Days, users of this gem had frontmatter like last_reviewed_on: 1970-01-01, which middleman loaded with load_file, resulting in {"last_reviewed_on" => #<Date: 1970-01-01...>}
  • ruby 3.1 switched to Psych 4.x by default, which changes the default for load_file from unsafe_load_file to safe_load_file, which means trying to parse last_reviewed_on: 1970-01-01 raises an exception (because Dates are not allowed)
  • If users of the gem instead try a different date format like last_reviewed_on: 1970/01/01 that parses as a string ({"last_reviewed_on" => "1970/01/01"}, but then this gem expects a Date so it blows up later
  • Given that middleman can't handle dates, and this gem can't handle strings, there's no way to construct frontmatter that works.

I have an untested hypothesis that this is fixed in middleman 4.4.3, because they switched to ::YAML.safe_load(..., permitted_classes: [Date, ...], ...) which should mean formats that parse as dates are allowed now (e.g. 1970-01-01).

You could still make a strong argument that this gem should support both strings and dates, since middleman's docs clearly suggest that weird stringy date formats are permitted.

richardTowers avatar Jan 10 '23 11:01 richardTowers

@LeePorte reports that the issue is still present with middleman 4.4.3 however.

richardTowers avatar Jan 10 '23 11:01 richardTowers

Hello @richardTowers thanks for having a look at this.

Your summary make perfect sense, I do remember the load_file to unsafe_load_file change causing an issue here.

The repo is using Middleman 4.4.2 at the moment, Gemfile.lock, so if it does get fixed in that dependency I can bump the version but we do have some other dependency issues in the Gemfile. The repo is using govuk_tech_docs (3.2.1). Due to these issues we paused updating the repo.

I do not have my branch with the fix anymore. Is it likely a fix will be applied?

NickWalt01 avatar Jan 10 '23 11:01 NickWalt01

It's at least on our radar now, I can't promise we'll get to it immediately though.

richardTowers avatar Jan 10 '23 12:01 richardTowers

I've raised https://github.com/middleman/middleman/issues/2614 with middleman. We should still consider supporting strings, irrespective of whether they decide to fix this.

richardTowers avatar Jan 10 '23 13:01 richardTowers

Ah, so has this broken building docs completely now?

lfdebrux avatar Jan 19 '23 11:01 lfdebrux

You could still make a strong argument that this gem should support both strings and dates, since middleman's docs clearly suggest that weird stringy date formats are permitted.

Could you point me towards where this is documented?

lfdebrux avatar Jan 20 '23 13:01 lfdebrux

Hi @lfdebrux my notes on the topic are here. I am not sure where the documentation is for Middleman. There is this page that states: "If you want, you can specify a full date and time as a date entry in the front matter, to help with ordering multiple posts from the same day. " and has the date as date: 2011/10/18

NickWalt01 avatar Jan 20 '23 14:01 NickWalt01

Hi @lfdebrux my notes on the topic are here. I am not sure where the documentation is for Middleman. There is this page that states: "If you want, you can specify a full date and time as a date entry in the front matter, to help with ordering multiple posts from the same day. " and has the date as date: 2011/10/18

Cool, thanks for the added context. Apologies, when I first looked at this ticket I thought it was solely a feature request, I didn't realise that the issue was preventing the docs from building the site even if the date was formatted correctly.

My main concern in allowing a string for the last_reviewed_on field is that it might mean we get the whole issue with ambiguously written dates; Ruby's Date.parse seems a little too flexible [1], it's not clear to me whether the result of parsing say 04/02/2020 is guaranteed to always be 4th February 2020, or if it is locale dependent. I also noticed that it says that 'if string does not specify a valid date, the result is unpredictable', which is worrying!

I think it might be safer to use Date.strptime(string, '%Y-%m-%d') instead, restricting the kinds of dates accepted by last_reviewed_date; does that sound reasonable @NickWalt01 @richardTowers?

lfdebrux avatar Jan 20 '23 15:01 lfdebrux

Hi @lfdebrux

Date.strptime("2022-08-22", '%Y-%m-%d') is accepted as a valid date, but Date.strptime("2022/08/22", '%Y-%m-%d') is not valid, so for me yes that is acceptable and I agree that it is better to restrict the date format that is used in the doc files to YYYY-MM-DD

NickWalt01 avatar Jan 23 '23 11:01 NickWalt01

I'd be fine with restricting dates to '%Y-%m-%d', but I think there's a tricky interaction here between ruby / yaml / middleman / tech docs gem.

The issue is that if you put a date in your frontmatter like:

---
date: 1970-01-01
---

Middleman will parse this YAML using YAML.load, which will now error in ruby >=3.1 because of the switch to safe_load. That's a bug in middleman in my opinion. Users of middleman that don't use the tech docs gem can work around this by using any format which ruby doesn't interpret as a date, but the tech docs gem actually wants a date so our users are between a rock and a hard place.

I don't think the tech docs gem gets involved early enough in the loading of a template to fiddle with the YAML frontmatter before it's parsed (but maybe it does?).

I guess we could also consider monkey patching middleman while we wait for a release... 🙈

richardTowers avatar Jan 23 '23 12:01 richardTowers

Looking at the Middleman commit history and the last tag release date, I can't see a new release soon, happy to wait for a new one as the docs work with the older version of Ruby, but 2.7 will be phased out soon:

Ruby 2.7 status: security maintenance release date: 2019-12-25

Ruby 2.6 status: eol release date: 2018-12-25 EOL date: 2022-04-12

NickWalt01 avatar Jan 24 '23 09:01 NickWalt01

Right, thanks for spelling it out for me @richardTowers. How frustrating.

lfdebrux avatar Jan 27 '23 14:01 lfdebrux

There's now a release of Middleman with support for Ruby 3.1 - v4.5.0. In theory, this bug should be fixed on middleman 4.5.0, but I haven't tested it yet.

richardTowers avatar May 04 '23 09:05 richardTowers

Hi @richardTowers thanks for the update, nice, we will give it a try with the new versions

NickWalt01 avatar May 04 '23 09:05 NickWalt01

There's now a release of Middleman with support for Ruby 3.1 - v4.5.0. In theory, this bug should be fixed on middleman 4.5.0, but I haven't tested it yet.

I ran bundle update on design-system-team-docs and it seems to have fixed the problem using YYYY-MM-DD. I had to lock the contracts gem at 0.16.1 because 0.17 errored, but all seems present and correct. Relevant changes in this PR.

stevenjmesser avatar Sep 04 '23 14:09 stevenjmesser