hjuutilainen-recipes
URLTextSearcher failing in LibreOffice recipe
This morning I'm getting the following error from AutoPkg when running my LibreOffice recipe:-
No match found on URL: https://www.libreoffice.org/download/libreoffice-fresh/?type=mac-x86_64
The LibreOffice download recipe uses URLTextSearcher with the re_pattern string
(?P<DOWNLOAD_URL>download.documentfoundation.org/libreoffice/stable/[\d\.]+/mac/x86_64/LibreOffice_(?P<version>[\d\.]+)_MacOS_x86-64.dmg)
I downloaded the page at the url supplied to URLTextSearcher:-
https://www.libreoffice.org/download/libreoffice-fresh/?type=mac-x86_64
and searched the text in BBEdit using the essentials of the above regular expression. The pattern matches several times in the page source, the first match being
download.documentfoundation.org/libreoffice/stable/6.2.0/mac/x86_64/LibreOffice_6.2.0_MacOS_x86-64.dmg
which looks fine to me. Not sure why URLTextSearcher is failing.
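For anyone who wants to repeat the BBEdit check outside an editor, here is a minimal sketch in Python. It assumes the recipe's pattern is wrapped in parentheses as a named group, and it uses a plain `re.search` as an approximation of what URLTextSearcher does, not the processor's actual code. The `search_page` helper and the sample string are made up for illustration.

```python
import re

# The re_pattern quoted above, with the outer named group parenthesised
# (assumption: the parentheses were lost when the pattern was pasted).
RE_PATTERN = (
    r"(?P<DOWNLOAD_URL>download.documentfoundation.org/libreoffice/stable/"
    r"[\d\.]+/mac/x86_64/LibreOffice_(?P<version>[\d\.]+)_MacOS_x86-64.dmg)"
)


def search_page(page_text):
    """Return the named groups of the first match, or None if nothing matches."""
    match = re.search(RE_PATTERN, page_text)
    return match.groupdict() if match else None


# Hypothetical snippet of page source containing the first match quoted above.
sample = (
    '<a href="https://download.documentfoundation.org/libreoffice/stable/'
    '6.2.0/mac/x86_64/LibreOffice_6.2.0_MacOS_x86-64.dmg">Download</a>'
)
result = search_page(sample)
```

If the saved page source matches here but AutoPkg still reports "No match found", the content AutoPkg fetched is probably not the same as the page that was saved by hand.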
The download page now redirects to https://www.libreoffice.org/download/download/ without preserving the ?type=mac-x86_64 query. Visiting without a user agent seems to default to MS Windows x86, i.e. the equivalent of visiting https://www.libreoffice.org/download/download/?type=win-x86&version=6.2.0&lang=en-US.
So, the search URL needs to be changed to https://www.libreoffice.org/download/download/?type=mac-x86_64.
This is addressed in #104 .
Looks like the text searcher is working. However, I noticed that it always picks the first match, which at the time of writing is release 6.2.2, and that release is marked as a pre-release.
From looking at the source of https://www.libreoffice.org/download/download/ it looks like 6.2.2 is just the first match for the regex.
Yep, I'm not sure what to do about this. The recipe works for now but it always grabs the "fresh" release. Their current download page doesn't use the fresh/still vocabulary at all to distinguish between the downloads and both of the current versions have stable in their URL.
Likewise. The only thing I can think of at the moment is writing a custom URL provider which takes all the matching versions (three at the time of writing) and returns one of them depending on the argument set in the recipe.
However, that seems like an overly complicated solution.
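To give an idea of how much code the custom provider idea would involve, here is a rough sketch of just the selection logic. It is not an AutoPkg processor. The function names and the `which` argument (with the values "oldest" and "newest") are hypothetical, and the pattern is the recipe's pattern with the dots escaped.

```python
import re

# Recipe pattern with literal dots escaped; matches every Mac x86_64 download link.
PATTERN = re.compile(
    r"download\.documentfoundation\.org/libreoffice/stable/[\d.]+/mac/x86_64/"
    r"LibreOffice_(?P<version>[\d.]+)_MacOS_x86-64\.dmg"
)


def version_key(version):
    """Turn '6.2.0' into (6, 2, 0) so versions sort numerically, not as text."""
    return tuple(int(part) for part in version.split(".") if part)


def pick_download(page_text, which="newest"):
    """Collect all matching versions and return (version, url) for one of them.

    `which` is a hypothetical recipe argument: "oldest" or "newest".
    """
    found = {m.group("version"): m.group(0) for m in PATTERN.finditer(page_text)}
    if not found:
        raise ValueError("No match found")
    versions = sorted(found, key=version_key)
    chosen = versions[0] if which == "oldest" else versions[-1]
    return chosen, "https://" + found[chosen]
```

Sorting on integer tuples rather than strings matters once a component reaches two digits, since "6.10.0" would otherwise sort before "6.2.0".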
Christian Lohmaier, a LibreOffice developer, suggested on freenode that this link would be more reliable http://download.documentfoundation.org/libreoffice/stable/
He said the lower version number should always be the still version.
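A sketch of how that suggestion could be applied, assuming the directory listing exposes each version as a link of the form `href="6.2.2/"`. I have not verified the exact listing markup, so the regex is a guess, and the function names and sample listing are made up.

```python
import re


def stable_versions(listing_html):
    """Return the version directories found in the listing, sorted numerically."""
    # Assumption: each version appears as a relative link such as href="6.2.2/".
    versions = set(re.findall(r'href="(\d+(?:\.\d+)+)/"', listing_html))
    return sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


def still_and_fresh(listing_html):
    """Lowest version is treated as still, highest as fresh."""
    versions = stable_versions(listing_html)
    if not versions:
        raise ValueError("No version directories found")
    return versions[0], versions[-1]


# Made-up sample of what the listing might contain.
sample_listing = (
    '<a href="../">Parent Directory</a>'
    '<a href="6.2.2/">6.2.2/</a>'
    '<a href="6.1.5/">6.1.5/</a>'
)
still, fresh = still_and_fresh(sample_listing)
```

If only one version directory is present, both values come back the same, so a recipe built on this would need to decide how to handle that case.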
So this problem is cropping up again with version 7.1.4 vs 7.1.3 being listed on the Release Notes. That said, it does look like scraping http://download.documentfoundation.org/libreoffice/stable/ could be a valid and easy solution.
https://www.libreoffice.org/download/download/ lists only two versions, and it would seem that still is always older than fresh. Either page could provide a source for both streams.
Or I guess we could consider this the mostly_fresh stream since there's the delay between release and being posted on the Release Notes.