Testing version doesn't sync with archive after migration
dh-shell-completions 0.0.3 migrated to testing on 22 Sep, but when I checked on 18 Oct, manpages.debian.org still showed 0.0.2 as its testing version. Now that 0.0.4 has been uploaded, the manpages.d.o versions are in sync (testing 0.0.3, unstable 0.0.4), so I suspect that only uploads trigger updates, not migrations.
Hey, thanks for your report.
You’re right that something seems off here, but your suspicion is not correct: debiman does not know about uploads or migrations, it always goes through the list of packages currently in the Debian archive.
However, I think there is a bug in cache invalidation that I have now tracked down based on this timeline:
This is what the Debian package tracker lists:
- [2024-09-09] Accepted dh-shell-completions 0.0.2 (source) into unstable (Blair Noctis)
- [2024-09-16] dh-shell-completions 0.0.2 MIGRATED to testing (Debian testing watch)
- [2024-09-16] Accepted dh-shell-completions 0.0.3 (source) into unstable (Blair Noctis)
- [2024-09-22] dh-shell-completions 0.0.3 MIGRATED to testing (Debian testing watch)
- [2024-10-19] Accepted dh-shell-completions 0.0.4 (source) into unstable (Blair Noctis)
This is what the debiman logfiles say, annotated for clarity with the resulting state on disk:
TZ=Europe/Zurich journalctl --root=2024-09-19 --since 2024-09-15 -u debiman --grep dh_shell_completions | cat
# rendering both versions because 0.0.2 migrated to testing
Sep 16 05:03:05 ex622 run-debiman.bash[1701967]: 2024/09/16 05:03:05 render.go:296: /srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz invalidated by /srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.gz
Sep 16 05:03:05 ex622 run-debiman.bash[1701967]: 2024/09/16 05:03:05 rendermanpage.go:322: rendering "/srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz"
Sep 16 05:03:05 ex622 run-debiman.bash[1701967]: 2024/09/16 05:03:05 rendermanpage.go:322: rendering "/srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz"
# -rw-r--r-- 1 root root 2,0K 2024-09-09 00:38 testing/dh_shell_completions.1.en.gz
# -rw-r--r-- 1 root root 4,8K 2024-09-16 05:03 testing/dh_shell_completions.1.en.html.gz
# -rw-r--r-- 1 root root 4,8K 2024-09-16 05:03 unstable/dh_shell_completions.1.en.html.gz
# rendering both versions because 0.0.3 entered unstable
Sep 17 05:03:25 ex622 run-debiman.bash[1813534]: 2024/09/17 05:03:25 render.go:296: /srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz invalidated by /srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.gz
Sep 17 05:03:25 ex622 run-debiman.bash[1813534]: 2024/09/17 05:03:25 rendermanpage.go:322: rendering "/srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz"
Sep 17 05:03:25 ex622 run-debiman.bash[1813534]: 2024/09/17 05:03:25 rendermanpage.go:322: rendering "/srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz"
# -rw-r--r-- 1 root root 2,0K 2024-09-16 21:43 unstable/dh_shell_completions.1.en.gz
# -rw-r--r-- 1 root root 5,5K 2024-09-17 05:03 unstable/dh_shell_completions.1.en.html.gz
# -rw-r--r-- 1 root root 5,5K 2024-09-17 05:03 testing/dh_shell_completions.1.en.html.gz
# NOTE: The log for 2024-09-22 does not contain any mention of dh_shell_completions!
# most likely cause:
# 1. debiman extracts the manpage to testing/dh_shell_completions.1.en.gz with modtime 2024-09-16 21:43
# 2. because the mod time of the raw manpage (2024-09-16 21:43) is older than the HTML version (2024-09-17 05:03), debiman assumes the HTML version is up to date and does not need to be re-generated.
# rendering both versions because 0.0.4 entered unstable
Oct 19 23:03:32 ex622 run-debiman.bash[1822969]: 2024/10/19 23:03:32 render.go:296: /srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz invalidated by /srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.gz
Oct 19 23:03:32 ex622 run-debiman.bash[1822969]: 2024/10/19 23:03:32 rendermanpage.go:322: rendering "/srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz"
Oct 19 23:03:32 ex622 run-debiman.bash[1822969]: 2024/10/19 23:03:32 rendermanpage.go:322: rendering "/srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz"
This is the state on disk:
% ls -hltr /srv/man/www/unstable/dh-shell-completions/ && head /srv/man/www/unstable/dh-shell-completions/VERSION
total 20K
-rw-r--r-- 1 root root 2,0K 2024-10-19 16:38 dh_shell_completions.1.en.gz
-rw-r--r-- 1 root root 5 2024-10-19 23:02 VERSION
-rw-r--r-- 1 root root 3,5K 2024-10-19 23:03 index.html.gz
-rw-r--r-- 1 root root 5,5K 2024-10-19 23:03 dh_shell_completions.1.en.html.gz
0.0.4#
% ls -hltr /srv/man/www/testing/dh-shell-completions/ && head /srv/man/www/testing/dh-shell-completions/VERSION
total 20K
-rw-r--r-- 1 root root 2,0K 2024-09-16 21:43 dh_shell_completions.1.en.gz
-rw-r--r-- 1 root root 5 2024-09-22 05:00 VERSION
-rw-r--r-- 1 root root 3,5K 2024-09-22 05:03 index.html.gz
-rw-r--r-- 1 root root 4,8K 2024-10-19 23:03 dh_shell_completions.1.en.html.gz
0.0.3#
So the problem consists of multiple parts:
- When extracting, we override the modtime of the raw manpage with what’s stored in the archive (i.e. the modtime set by the uploader): https://github.com/Debian/debiman/blob/4afba3a1fef8fc4215d59c13926082ac1001a987/cmd/debiman/download.go#L409
- Invalidating other versions of a manpage updates the HTML modtime, but re-uses the content (as an optimization).
- This breaks the assumption we make here: If the HTML version is more recent than the raw manpage, it must also contain the contents of that raw manpage: https://github.com/Debian/debiman/blob/4afba3a1fef8fc4215d59c13926082ac1001a987/cmd/debiman/render.go#L250
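To make the failure mode concrete, here is a minimal sketch of an mtime-based freshness check, fed with the timestamps from the logs above. The names needsRender and mustParse are hypothetical and the comparison is only assumed to resemble what render.go does; this is not debiman's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// mustParse turns a timestamp as printed by ls in this thread into a time.Time.
func mustParse(s string) time.Time {
	t, err := time.Parse("2006-01-02 15:04", s)
	if err != nil {
		panic(err)
	}
	return t
}

// needsRender encodes the assumption described above: the HTML is considered
// stale only if the raw manpage is newer than the HTML file.
// Hypothetical helper, assumed to mirror the real check only in spirit.
func needsRender(rawModTime, htmlModTime time.Time) bool {
	return rawModTime.After(htmlModTime)
}

func main() {
	// 2024-09-17: the testing HTML is rewritten (new modtime) because 0.0.3
	// entered unstable, but its content is still the re-used 0.0.2 content.
	htmlTesting := mustParse("2024-09-17 05:03")

	// 2024-09-22: 0.0.3 migrates. The raw manpage is extracted into testing
	// with the modtime from the archive, which predates the HTML file.
	rawTesting := mustParse("2024-09-16 21:43")

	// The check concludes that nothing needs to be done, so the stale
	// 0.0.2 content stays in place.
	fmt.Println(needsRender(rawTesting, htmlTesting)) // false
}
```

With the extraction time (2024-09-22) as the raw manpage's modtime instead, the same check would return true, which is why dropping the Chtimes call would mask the problem.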
So, what can we do to fix the issue?
- If we just delete the Chtimes call, the modtime would match the extraction time, which will fix the issue, but break the “Source last updated” line in user-facing HTML versions.
- Another way to fix the issue would be to stop relying on the mtime and instead inspect the HTML file to figure out which version is contained in the HTML. However, during my investigation I realized that we incorrectly store a reference to the version of the current package, even when incorrectly re-using older content: the footer says “dh_shell_completions.1.en.gz (from dh-shell-completions 0.0.3)”, even though the content is from 0.0.2. So we would first need to fix that.
- We could disable the content re-use optimization entirely. I haven’t checked how much slower that would make a typical run.
I’m not sure yet which path I like most. Maybe option 2 deserves a shot, and if it turns out to be too hard for some reason, we can resort to option 3.
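A rough sketch of what option 2 could look like, assuming the renderer wrote a marker into the HTML only when it actually rendered the content (and not when it re-used content). The marker format and the names renderedFromRe, renderedVersion and needsRender are made up for illustration; debiman has no such marker today.

```go
package main

import (
	"fmt"
	"regexp"
)

// renderedFromRe matches a hypothetical marker recording which package
// version the HTML content was rendered from.
var renderedFromRe = regexp.MustCompile(`<!-- rendered-from: (\S+) -->`)

// renderedVersion extracts the version from the marker, if present.
func renderedVersion(html string) (string, bool) {
	m := renderedFromRe.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// needsRender reports whether the HTML must be re-generated for pkgVersion.
// It ignores modtimes entirely and treats a missing marker as stale.
func needsRender(html, pkgVersion string) bool {
	v, ok := renderedVersion(html)
	return !ok || v != pkgVersion
}

func main() {
	html := "<html>...<!-- rendered-from: 0.0.2 --></html>"
	fmt.Println(needsRender(html, "0.0.3")) // true
	fmt.Println(needsRender(html, "0.0.2")) // false
}
```

This only works if the marker reflects the content's true origin, which is exactly the part that is currently wrong in the footer.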
I can kick off a run with a forced full re-rendering to get the current manpage archive fixed (will take a few days to complete and propagate, though).
> I can kick off a run with a forced full re-rendering to get the current manpage archive fixed (will take a few days to complete and propagate, though).
Looks like this was a bit quicker than expected: the corrected version now seems to be live.
Thanks for the detailed analysis. Some rough thoughts:
- "Encode" package version info in the HTML page's creation time, and compare it with the man page's "last updated" time, leaving the modification time free to change and represent modification of the file itself.
- Link unstable/foo to foo/$ver, where $ver is the current version in unstable, so file creation/modification times are decoupled from package versions.
- Split sections whose changes do not depend on actual content, e.g. "other versions", footer timestamps, etc., into partials, and <iframe> them into content pages.
Note there are two bug reports in Debian also on the topic of manpages.debian.org not getting updated:
- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=986030
- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994761
As a package example, git-buildpackage 0.9.36 had been in Debian unstable since Dec 22nd, but it did not update until today Dec 27th.
> Note there are two bug reports in Debian also on the topic of manpages.debian.org not getting updated:
Thanks, but I don’t follow Debian mailing lists after having retired from the project in 2019: https://michael.stapelberg.ch/posts/2019-03-10-debian-winding-down/
The quickest way to report a bug with debiman is to use this GitHub issue tracker.
> As a package example, git-buildpackage 0.9.36 had been in Debian unstable since Dec 22nd, but it did not update until today Dec 27th.
I looked into it and this seems like a different issue. The file unstable/git-buildpackage/gbp-buildpackage.1.en.html.gz was generated on December 22nd, but apparently was not synchronized to Debian’s static page hosting infrastructure. adsb@ looked into it and fixed it AFAICT.
I’ll see if I can find some time to address the bug that this issue tracks.