Provide guidance on when to remove old entries from <releases>
This question came up in the context of maintaining gnome-software. Is there a point at which it’s OK to remove old entries from the <releases> list? Or to remove their release notes but keep the <release> metadata (such as version number, release date, stable/unstable)? Or is there some use case/consumer of the metainfo which needs to know historic release information?
From a maintainer’s point of view, carrying around old release information for longer than the software is supported is a bit of a waste of space and bandwidth (although it’s really only a tiny amount of space, and metainfo compresses really well). But maintainers of software don’t necessarily see the use cases of all consumers of metainfo, which is why I think it would be useful to provide some guidance about this in the spec.
I’m happy to put together a merge request if someone can answer my questions so I can be confident that what I’m writing covers all use cases!
At least the release dates seem relevant, as they affect the download numbers and are a significant trigger (or not).
So something like a trending list could be improved by taking release dates into account.
Just for some added info: at elementary we usually keep the last 5 releases, and after that we only keep the release metadata.
AppStream's own tools to generate a metainfo file from a release YAML or NEWS file have a configurable limit, but default to keeping the last 6 entries. The appstream-generator also includes the last 4-6 entries.
All of these limits are configurable, however, to whatever upstream or downstream likes best.
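To make the two policies mentioned above concrete (keep notes for the newest few releases, keep only the metadata for older ones, optionally cap the total), here is a minimal sketch using Python's standard library. It is not how AppStream's own tools or appstream-generator implement their limits; `trim_releases`, its parameters and the sample component are hypothetical, and it assumes the `<release>` entries are already sorted newest first.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample metainfo, newest release first.
SAMPLE = """<component type="desktop-application">
  <id>org.example.App</id>
  <releases>
    <release version="1.3" date="2024-03-01"><description><p>Newest.</p></description></release>
    <release version="1.2" date="2023-09-01"><description><p>Older.</p></description></release>
    <release version="1.1" date="2023-03-01"><description><p>Oldest with notes.</p></description></release>
    <release version="1.0" date="2022-09-01"/>
  </releases>
</component>"""


def trim_releases(root, keep_notes=5, keep_total=None):
    """Keep release notes for the newest `keep_notes` entries only.

    If `keep_total` is set, entries beyond that count are dropped entirely.
    Assumes <release> elements are sorted newest first.
    """
    releases = root.find("releases")
    if releases is None:
        return root
    for index, release in enumerate(releases.findall("release")):
        if keep_total is not None and index >= keep_total:
            releases.remove(release)  # drop the whole entry
        elif index >= keep_notes:
            for description in release.findall("description"):
                release.remove(description)  # keep version/date, drop notes
    return root


trimmed = trim_releases(ET.fromstring(SAMPLE), keep_notes=2, keep_total=3)
print(ET.tostring(trimmed, encoding="unicode"))
```

With `keep_total=None` nothing is ever removed outright, which roughly matches the "keep the metadata forever, drop old notes" approach described for elementary.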
My point in the gnome-software change proposal is based on this:
$ ls -lh /usr/share/metainfo/org.gnome.Software.metainfo.xml
-rw-r--r--. 1 root root 198K Jan 14 01:00 /usr/share/metainfo/org.gnome.Software.metainfo.xml
That is, the metainfo.xml is 198 KB and it keeps growing. The file is stored uncompressed, and libxmlb is anything but a space saver. The only compression is in the distro metainfo file, as in the case of Fedora:
$ ls -lh /usr/share/swcatalog/xml/fedora.xml.gz
-rw-r--r--. 1 root root 7.1M Feb 13 01:00 /usr/share/swcatalog/xml/fedora.xml.gz
One point I did not mention: libadwaita allows creating an about dialog from the metainfo data, but the metainfo then needs to be stored in the executable as a GResource, which is wasteful compared to how little data it reads from the metainfo to populate the dialog. That is how I discovered the size of the file for gnome-software.
From my point of view, I do not see why any metainfo reader would need to show 10 years of release history, even if it's only a version and the date it was released. With only that information kept, gnome-software would still show those releases, as "empty" release information without a release description. See for example the Firefox version history. It's useless for users. If any automation relies on the old versions, it should not depend on human-produced data; it should have its own update history log. The first page of the AppStream data documentation sounds like it's for users, not for machines (machines do good things with it, I know; I just mean the main output of the data is for users/humans).
To the best of my knowledge, we only ever show the latest release on flathub.org, and I can't remember anyone ever asking for more.
More might be helpful to track down bugs, but you can look at the changelogs in git hopefully.
Tangentially related, but: Since the AppStream 1.0 spec, you can actually have the release information in a separate file, see https://www.freedesktop.org/software/appstream/docs/chap-Metadata.html#tag-releases
So, you won't have a mega-large file (you'd have a smaller metainfo file, and a larger release file ^^).
But yeah, I don't think we should mandate a limit, but recommending a sensible amount of release entries somewhere is probably a good idea, so it doesn't get completely out of hand...
Interesting, the gnome-software does not support it (yet). I do not get from the link what the external releases file should look like. Is it just:
<releases>
<release..../>
</releases>
or
<component>
<id>my.app.id</id>
<releases>
<release .../>
</releases>
</component>
?
> Interesting, the gnome-software does not support it (yet).
Sometimes I forget that gnome-software uses libappstream in a really weird way and doesn't make use of many of its built-in facilities 😅
It's the former example with very strict naming of the files (/usr/share/metainfo/releases/%{cid}.releases.xml), here's the spec for it: https://www.freedesktop.org/software/appstream/docs/sect-Metadata-Releases.html
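For anyone wanting to try the external file, here is a rough sketch of moving inline release entries into a `%{cid}.releases.xml` file whose root element is a bare `<releases>`, as described above. The `split_releases` helper and the sample component are hypothetical, and marking the remaining stub as `<releases type="external"/>` is my reading of the linked spec, so please check it against the spec before relying on it.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical sample metainfo with inline release entries.
SAMPLE = """<component type="desktop-application">
  <id>org.example.App</id>
  <releases>
    <release version="1.1" date="2024-03-01"/>
    <release version="1.0" date="2023-03-01"/>
  </releases>
</component>"""


def split_releases(metainfo_xml, metainfo_dir):
    """Write releases/<cid>.releases.xml below metainfo_dir.

    Returns the rewritten metainfo XML and the path of the new file.
    """
    root = ET.fromstring(metainfo_xml)
    cid = root.findtext("id")
    releases = root.find("releases")
    if cid is None or releases is None:
        raise ValueError("metainfo needs both <id> and <releases>")

    # The external file has <releases> as its root, with no <component> wrapper.
    external = ET.Element("releases")
    external.extend(releases.findall("release"))
    out = Path(metainfo_dir) / "releases" / f"{cid}.releases.xml"
    out.parent.mkdir(parents=True, exist_ok=True)
    ET.ElementTree(external).write(str(out), encoding="utf-8", xml_declaration=True)

    # Leave an empty stub behind (assumed spelling, see the note above).
    for release in list(releases):
        releases.remove(release)
    releases.text = None
    releases.set("type", "external")
    return ET.tostring(root, encoding="unicode"), out


with tempfile.TemporaryDirectory() as tmp:
    new_metainfo, release_file = split_releases(SAMPLE, tmp)
    print(new_metainfo)
    print(release_file.read_text(encoding="utf-8"))
```

In a real package, `metainfo_dir` would correspond to /usr/share/metainfo, so the file ends up at the strictly named path mentioned above.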
> Sometimes I forget that gnome-software uses libappstream in a really weird way and doesn't make use of many of its built-in facilities
Right, I do not know why it's that way. It's truly odd.
> Right, I do not know why it's that way. It's truly odd.
When GS was using appstream-glib, Richard Hughes wanted to replace it entirely with libxmlb and simplify as-glib or maybe even get rid of it. Implementing some stuff separately also made it a little easier for stuff that wasn't using AppStream, which was (and is) only Snap AFAIK. And I'd need to ask Richard, but I guess the whole thing was a bit more complicated (especially because AppStream data can exist as YAML too) and therefore GS got stuck mid-way in refactoring.
After switching to libappstream, things got a bit absurd, because AppStream had meanwhile switched to Richard's libxmlb internally, so now a lot of stuff and parsing was duplicated. IMHO GS would benefit a lot from just using libappstream fully, like Discover does, instead of operating on libxmlb structures directly. But that would take some developer time, which is why I guess nobody has done it.
On the matter of this issue: Where should such advice even go? In the spec page for the releases XML? Into the quickstart guide? Or maybe just on something like the MetaInfo Creator site or Flathub advice pages?
Having a ton of release entries isn't really a problem for any of the tools, so IMHO it's really up to the individual projects what they want to support. The validator could emit a pedantic or info hint if the count gets really out of hand, like 100/200+ entries, but I'm not sure if that is even needed.
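As an illustration of what such a hint could look like, here is a small sketch that counts the `<release>` entries and returns an informational message once a threshold is exceeded. This is not an existing appstreamcli check; `release_count_hint`, the message wording and the default threshold of 100 are made up for the example.

```python
import xml.etree.ElementTree as ET


def release_count_hint(metainfo_xml, threshold=100):
    """Return an info-level hint string if there are too many releases."""
    root = ET.fromstring(metainfo_xml)
    count = len(root.findall("./releases/release"))
    if count > threshold:
        return (
            f"info: {count} release entries (more than {threshold}); "
            "consider trimming old ones"
        )
    return None


# Hypothetical component with 150 synthetic release entries.
entries = "".join(f'<release version="0.{n}" date="2020-01-01"/>' for n in range(150))
sample = f"<component><id>org.example.App</id><releases>{entries}</releases></component>"
print(release_count_hint(sample))
```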