flatpak-external-data-checker

Support rewriting scraped URLs

Open wjt opened this issue 6 years ago • 3 comments

Anydesk is checked by scraping their human-readable download page. This links to a URL of the form https://download.anydesk.com/linux/anydesk-$VERSION-$ARCH.tar.gz.

When a new version is released, those files disappear, rendering the Flatpak non-installable. However, older (and current) versions remain available at https://download.anydesk.com/linux/generic-linux/anydesk-$VERSION-$ARCH.tar.gz.

We could teach htmlchecker to rewrite the URLs it scrapes. What I had in mind was something like:

"x-checker-data": {
    "type": "html",
    "url": "https://anydesk.com/en/downloads/linux",
    "url-pattern": "(https://download.anydesk.com/linux/anydesk-([0-9.]+)-amd64.tar.gz)",
    "version-pattern": "https://download.anydesk.com/linux/anydesk-([0-9.]+)-amd64.tar.gz",
    "url-substitution": ["/linux/", "/linux/generic-linux/"]
}

and the checker would just call url.replace(url_substitution[0], url_substitution[1]) on the scraped URL.
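A minimal sketch of that substitution step, assuming the checker has already extracted the URL from the page and has read the two-element `url-substitution` list from the checker data. The function name `apply_url_substitution` and the version number in the sample URL are hypothetical, for illustration only:

```python
def apply_url_substitution(url, url_substitution=None):
    """Rewrite a scraped URL using an optional [old, new] pair.

    Hypothetical helper: returns the URL unchanged when no
    substitution is configured.
    """
    if not url_substitution:
        return url
    old, new = url_substitution
    # str.replace swaps every occurrence of `old`, not just the first.
    return url.replace(old, new)


# Made-up version number, following the URL form described in the issue.
scraped = "https://download.anydesk.com/linux/anydesk-5.5.1-amd64.tar.gz"
rewritten = apply_url_substitution(scraped, ["/linux/", "/linux/generic-linux/"])
print(rewritten)
```

Since `str.replace` rewrites every occurrence, the first element of the pair should be specific enough to appear only once in the URL, as `/linux/` does here.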

An alternative that would work in this case would be to instead scrape the index at https://download.anydesk.com/linux/, find all matches of the url-pattern, and take the one with the highest version number.
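A rough sketch of that alternative, assuming the index page contains full URLs that match the url-pattern (the real index may use relative links, in which case the pattern would need adjusting). The function name `newest_match` and the sample listing are made up for illustration:

```python
import re

# Same shape as the url-pattern above, with literal dots escaped:
# group 1 is the whole URL, group 2 is the version.
URL_PATTERN = (
    r"(https://download\.anydesk\.com/linux/"
    r"anydesk-([0-9.]+)-amd64\.tar\.gz)"
)


def newest_match(index_html, url_pattern=URL_PATTERN):
    """Return (url, version) for the highest version found, or None."""
    best = None
    for match in re.finditer(url_pattern, index_html):
        url, version = match.group(1), match.group(2)
        # Compare numerically so that 5.10.0 sorts above 5.5.1.
        key = tuple(int(part) for part in version.split(".") if part)
        if best is None or key > best[0]:
            best = (key, url, version)
    return None if best is None else (best[1], best[2])


# Hypothetical index listing; the version numbers are invented.
sample_index = """
<a href="https://download.anydesk.com/linux/anydesk-5.5.1-amd64.tar.gz">5.5.1</a>
<a href="https://download.anydesk.com/linux/anydesk-5.10.0-amd64.tar.gz">5.10.0</a>
<a href="https://download.anydesk.com/linux/anydesk-5.1.2-amd64.tar.gz">5.1.2</a>
"""
print(newest_match(sample_index))
```

Versions are compared as tuples of integers rather than as strings, because a plain string comparison would rank 5.5.1 above 5.10.0.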

wjt, Nov 02 '19 16:11

Are there any updates on implementing this? This feature would be really useful even in a slimmed-down variant that only checks a directory on an FTP server for a newer version.

jakobjakobson13, Nov 28 '19 12:11

@jakobjakobson13 pull requests very welcome!

wjt, Nov 28 '19 14:11

I think we can now cover the Anydesk case using the feature added in #46.

wjt, Dec 19 '19 13:12