Improve Unicode script (#881)
Overhauled the script to extract all available revisions for each of the standards, so it is possible to link to a specific one.
The main URL for each Unicode standard now also points to the latest version live on the Unicode website.
The update drops rawDate at the root level. Now, I realize that SpecRef is somewhat inconsistent there: the date is always set for W3C entries (to the date of the latest version), sometimes for entries with versions in biblio.json, and never for WHATWG entries. Given that Unicode specs are not updated on a continuous basis, I would report the last date to the root level as well, so that specs that care about dates can display the date when they reference the spec.
I can add that no problem, but I'm not sure if it's a good idea in general for versioned entries when not referencing any version in particular?
If they want to explicitly state the last version they checked for compatibility alongside its date, they can now reference a particular version.
For a non-specific version, however, the date would cause the documents referring to it to also change the date any time they're recompiled, even if the writer has not actually checked the newer version to be fully compatible with the documentation.
For example, UTS46-33 made some processing changes that the WHATWG URL spec did not cover at the time, and the spec had to be changed as a result (https://github.com/whatwg/url/issues/836). With the date there, any recompilation of the WHATWG URL document between the release of UTS46-33 and the amending of the WHATWG URL standard would also update the date, incorrectly implying that the UTS46-33 changes had already been taken into account.
IMO if they want to specify a non-specific version with a check date, that should be manually stated by the writer, as the compilation time will be later than the time they've checked it, and the refDate at the root level could be different.
I think we should leave the date for the reason @tidoust mentioned here:
I guess the argument goes both ways. That is, without any mention of date, you also imply that the latest version you're going to get when you retrieve the URL was the one taken into account. That's what you get when you choose to reference "the latest version of a spec". With a date, you could at least theoretically speaking spot the fact that the document you're referencing has changed when you re-build your spec.
Would it make sense to do that outside the extraction script, though? Something like max([c.refdate for c in versions]), so it's consistent for other sources; currently that is not the case, as @tidoust said for W3C standards.
Possibly. I'd argue for doing this in a separate PR, though.
I guess that could be done in https://github.com/tobie/specref/blob/main/lib/bibref.js#L263-L270, e.g. if parent.rawDate is null, check whether the latest version's date isn't, and copy it.
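
To make the fallback idea concrete, here is a minimal sketch in Python rather than in the actual bibref.js code. It assumes an entry is a dict with an optional root-level `rawDate` and a `versions` mapping whose values may each carry their own `rawDate` as an ISO 8601 string. The function names `latest_raw_date` and `fill_root_date`, the sample entry, and its dates are all hypothetical, made up for illustration.

```python
from typing import Optional


def latest_raw_date(entry: dict) -> Optional[str]:
    """Return the most recent rawDate among an entry's versions, or None."""
    versions = entry.get("versions") or {}
    dates = [v["rawDate"] for v in versions.values() if v.get("rawDate")]
    # Assumption: dates are ISO 8601 strings (YYYY-MM-DD), which sort
    # chronologically when compared as plain strings.
    return max(dates, default=None)


def fill_root_date(entry: dict) -> dict:
    """Copy the latest version date to the root only if the root has none."""
    if entry.get("rawDate") is None:
        latest = latest_raw_date(entry)
        if latest is not None:
            entry["rawDate"] = latest
    return entry


# Hypothetical entry with made-up version keys and dates.
example = {
    "title": "Example Versioned Spec",
    "versions": {
        "rev-1": {"rawDate": "2020-01-01"},
        "rev-2": {"rawDate": "2021-06-15"},
    },
}
fill_root_date(example)
```

An entry that already has a root-level `rawDate` is left untouched, so sources that set the date themselves would keep their current behaviour under this sketch.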