Git checker doesn't get timestamps, resulting in unwanted PRs
On a whim I wrote https://github.com/flathub/org.gnome.Boxes.Extension.OsinfoDb/pull/2 to automatically update the database used by GNOME Boxes when it is tagged upstream. When reviewing my own PR I noticed that the date added in the appdata is today's date, rather than the date attached to the tag.
It doesn't seem to be possible to ask git ls-remote to include more commit metadata. I guess the "easiest" thing to do would be to make a shallow clone of the repository at exactly the tag in question, then read the timestamp out of the tag; but it would be nice to avoid having to fetch the commit contents. I don't know Git well enough to know if there is an operation that can do this. (The GitHub API has a way! But that's not useful in cases, like this one, which aren't on GitHub.)
Sure it would be a nice feature to have. But I believe it won't make much difference in practice in most cases, because f-e-d-c on Flathub is likely to get the new tag about 1 hour after it was tagged, so the current date is likely to match the tag date anyway.
The name of the branches that f-e-d-c pushes are based on the tree hash, which will change whenever the date changes, so if a PR is not merged within the same UTC day as being opened, it'd open another one because the hash would change.
Maybe theoretical? Maybe acceptable?
Didn't think about it. Yeah, that would be not great.
Maybe we should add GitHub and GitLab checkers instead, that would use API?
Or even add a special case for git_ls_remote() helper - if the repo is hosted on GitHub or GitLab, use their APIs to get commit timestamp.
I really would like to avoid cloning whole repos. This would make checking git sources significantly slower.
That could work. The GitLab API for fetching a tag's metadata is unauthenticated if the repo is public:
$ http GET https://gitlab.gnome.org/api/v4/projects/gnome%2Fgnome-initial-setup/repository/tags/3.38.2
HTTP/1.1 200 OK
Cache-Control: max-age=0, private, must-revalidate
Connection: keep-alive
Content-Length: 1699
Content-Type: application/json
Date: Fri, 12 Mar 2021 08:45:04 GMT
Etag: W/"d7088e5a56582b820054cfdcea731b6d"
Server: nginx
Vary: Origin
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Gitlab-Feature-Category: source_code_management
X-Request-Id: 01F0JSRM9EBF100F4E4E05R8R2
X-Runtime: 0.044975
{
"commit": {
"author_email": "[email protected]",
"author_name": "Will Thompson",
"authored_date": "2020-11-21T15:08:26.000+00:00",
"committed_date": "2020-11-21T15:08:26.000+00:00",
"committer_email": "[email protected]",
"committer_name": "Will Thompson",
"created_at": "2020-11-21T15:08:26.000+00:00",
"id": "8aacbcb5913ef340fc6a78e6806177b23810766b",
"message": "NEWS for 3.38.2\n",
"parent_ids": [
"099f5f52535f22ab0fb63a74440561c506e2188e"
],
"short_id": "8aacbcb5",
"title": "NEWS for 3.38.2",
"web_url": "https://gitlab.gnome.org/GNOME/gnome-initial-setup/-/commit/8aacbcb5913ef340fc6a78e6806177b23810766b"
},
"message": "Git-EVTag-v0-SHA512: fab136661b6d1de47c8540773309664249fb43e4f301aaabdff7e889fb4c0ff1f9969cc61c892fd719c8439bbf3f786e4c9b1041443ebbc88689d25daa7e8281\n-----BEGIN PGP SIGNATURE-----\n\niQIzBAABCgAdFiEEHmjljPJViIMBZFtWNCLcDXrUgqcFAl+5LbAACgkQNCLcDXrU\ngqdRNA//WyWvz2XPrcLP7pCjwtV3nVVhWwnI1XEtlWUuYbFGQ3zZ2oIRZZcBwdlH\nmmgd8R8glDL1RAt7Pjor9CWSMmjqlK5eoT2IBCvNlMuQcmrTHGUoZ+uMWZKQYFXu\nGYfHkyF/1RwGWX4vDKW8P77grei7T507DQWBOwY1xEDllTlL0qAiAUzOHHTRFjNq\nHlpTahfuILyA5l/c08Wk/rPjhR+O1L3hBpZ8izdPbpd0yEUI8Bj4u/kt45jkVCRZ\nJSEtTRw1MWHbJtbkeW+Uw6IlLoHEAWVLcnFfw8YdLDXanoupzcLExmi8eUFPZMkO\n6br3s/ww3Ha19ZQuJdO1Ai/+kcsUxBIHwmhLPtbwjBx5TcpCggJIkWMfcZvvKWEB\nRlKrT3PWELAMD3eZg9zyZFUxsgPtTHwwD/LQLb7StyOJJ6Wl4O5/jbvKxoBWP+TM\nMJzB7hQlqrfhymhCkKCE7MOZV8fECreCc7BMeJpDC4vTdES4qVYgAlRkjNLa1DDB\n+jW0UDUcoyPdM3GIkb84uP/A5b7yNpGh49o6XMozQUkTWY1MqN5hSS37HZZ3xXPc\nZfhiODNQBnKf5x5ClE3xeLkpaIIt7z7DxfJenza3yFGh6tj5d/m5NXkIT603sY63\nqIEwtZtLVyJYYPh/RgP45CcW6pLY77pMiIv5ErVhfA4tzszFilA=\n=D9Dd\n-----END PGP SIGNATURE-----",
"name": "3.38.2",
"protected": false,
"release": null,
"target": "b873e05ad7c4c91ae690cdadb8f1779e42d3ee97"
}
GitHub's appears to need authentication but happily the script already knows how to do that.
(Was there a reason for shelling out to the git CLI rather than using a Python library OOI?)
(Was there a reason for shelling out to the
gitCLI rather than using a Python library OOI?)
Dunno. git command was used previously in main.py, so I just continued using it in utils.py.
if a PR is not merged within the same UTC day as being opened, it'd open another one because the hash would change
An alternate approach to prevent this would be to invent an algorithm to uniquely identify the update we made without relying on git commit hash to match.
And yet another idea for workaround - maybe add timestamp-query to JSONChecker? This would allow extracting timestamps from GitHub / GitLab APIs.
if a PR is not merged within the same UTC day as being opened, it'd open another one because the hash would change
An alternate approach to prevent this would be to invent an algorithm to uniquely identify the update we made without relying on git commit hash to match.
We have an in-memory representation of the manifest file(s) so I guess we could just hash those.
And yet another idea for workaround - maybe add
timestamp-queryto JSONChecker? This would allow extracting timestamps from GitHub / GitLab APIs.
Thanks for adding this in #155. It works nicely. I will leave this ticket open as a wishlist item – it'd be less obscure to have the Git checker do the right thing by itself. Maybe I'll have a crack when I get time.
This now also affects tarballs from GitHub, which now don't have persistent timestamps in HTTP Date header.