libappstream is not cross-platform anymore
Hello!
I am a GIMP maintainer. We are using the AppStream format as the metadata format for our upcoming extension format (extensions can be things like plug-ins, themes, brush sets, etc.). Until recently we were using appstream-glib, then we moved to libappstream, since the former is in maintenance mode.
Unfortunately the MSYS2 project just told us that libappstream is not cross-platform anymore since you just ported your code to libfyaml which doesn't support Windows! 😱 For us, you can imagine this is a huge blocker since GIMP is heavily cross-platform and our extension format is also obviously meant to work on Windows.
Are there any plans regarding this? Are you planning to port libfyaml to Windows so that libappstream becomes cross-platform again? (And I am not even mentioning other OSes like the *BSDs. The repository I found only mentions Linux and macOS, but I sure hope this is meant to work everywhere.)
Thanks!
Upstream issue: https://github.com/pantoniou/libfyaml/issues/10
Oh, interesting... I didn't even know libappstream already worked properly on Windows, I thought https://github.com/ximion/appstream/pull/541 was needed for that (unfortunately, no reply on that PR for a while, so it will likely be closed soon as it's stale).
There isn't anything inherently non-portable in libfyaml; in fact, it has been ported to Windows: https://github.com/pantoniou/libfyaml/pull/67. So I guess that patch would need its feedback addressed and then be merged. Meson support apparently was also something thought about at libfyaml, which would likely also help.
The state of YAML C libraries is pretty sad unfortunately, libyaml itself doesn't receive much new development and leaves a lot of optimization potential on the table. Libfyaml has a much better API, performance advantages, YAML 1.2 support, much better control over the generated output, etc. However, even that library doesn't seem to have the same level of support it once had.
Either way, adapting the Windows support patch for libfyaml would already help. @pantoniou's feedback on that PR also makes a ton of sense to me and would shrink down the patch volume a lot (by not having API changes and redefining alloca on Windows). I had no idea that Windows doesn't have alloca! Wow...
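For context on the alloca point, a compatibility shim along the following lines is one common way to get alloca() on Windows toolchains. This is only a minimal sketch and not the code from the libfyaml PR: `count_fields` is a hypothetical helper that exists purely to exercise the macro, and the assumption that other Unix-like systems declare alloca() in `<stdlib.h>` would need checking per platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the header that declares alloca() on this platform. */
#if defined(_WIN32)
#  include <malloc.h>        /* MSVC and MinGW declare _alloca() here */
#  ifndef alloca
#    define alloca _alloca   /* map the POSIX-style name to the Windows one */
#  endif
#elif defined(__linux__) || defined(__APPLE__)
#  include <alloca.h>
#else
#  include <stdlib.h>        /* assumption: e.g. the BSDs declare alloca() here */
#endif

/* Hypothetical helper: counts non-empty comma-separated fields in a short
 * line, using a stack-allocated scratch copy because strtok() modifies
 * its input. Only suitable for small inputs, since alloca() cannot
 * report allocation failure. */
size_t count_fields(const char *line)
{
    assert(line != NULL);

    size_t len = strlen(line) + 1;
    char *scratch = alloca(len);   /* released automatically on return */
    memcpy(scratch, line, len);

    size_t n = 0;
    for (char *tok = strtok(scratch, ","); tok != NULL; tok = strtok(NULL, ","))
        n++;
    return n;
}
```

With this in place, `count_fields("a,b,c")` yields 3 on any of the listed platforms without the calling code needing its own `#ifdef`.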
In any case, I'll help in ways I can (my Windows knowledge nowadays is rather limited, it has been a long time since I built anything specifically for that platform). Reaching out to the libfyaml maintainer may be useful though, there's quite a few PRs that would make sense to get merged...
What I'd rather want to avoid is porting to a different YAML library, as fyaml enabled quite a lot of code cleanup and simplifications in appstream.
I wonder if it would be possible/a good idea to make appstream build without YAML support as a temporary solution. As far as I can see, GIMP uses only the XML parts.
The only other downstream user we have is libadwaita, which as far as I can see supports YAML, but doesn't use the APIs directly, and the examples only use XML (speaking of XML, the C library state is also sad :) )
That's not possible: YAML support is deeply interconnected with the XML functions in the code, and it is also used for JSON parsing and for other API features such as NEWS file conversion, machine-readable validator output functions and other stuff like that.
speaking of xml, the C library state is also sad :)
Yes indeed - hopefully we can get a solution for that... It's the age-old classical issue of foundational open source projects not receiving proportional support.
Thanks. I'll help however I can too. A few years ago, I would have just contributed the patch (I contributed to make appstream-glib build on Windows too, many years ago!), but these days I have too much to do and I'm tired! 😛
Also I don't use Windows either (again, I was still going the extra mile, cross-building for Windows back then, but these days others do this just as well!).
Anyway I was hesitating suggesting making YAML support an option too. Too bad it's not possible. 😅
We'll be looking forward to any unblocking on this topic in any case!
Anyway I was hesitating suggesting making YAML support an option too. Too bad it's not possible. 😅
Yeah, it would break the API hard, but more importantly, it would be an insane amount of work, I think way more work than just getting libfyaml to build, especially since a preliminary patch with feedback already exists for the latter.
I do have to say, while Windows isn't a priority for the library, it is also not ignored - I'll try my best to help the GIMP. Also, in libappstream, there are Linux-only features, but those are mainly in the parts of the code where hardware support is tested for, and those are used in software-centers on Linux. Which isn't a concern (yet...) on Windows (and currently we can emit an error on Windows explaining this, which is a very safe fallback path).
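The "emit an error on unsupported platforms" fallback described above could look roughly like the sketch below. Everything here is hypothetical and for illustration only: `probe_hardware`, `enum probe_status` and the error text are made-up names, not part of the AppStream API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes; not part of the AppStream API. */
enum probe_status { PROBE_OK = 0, PROBE_UNSUPPORTED = 1 };

/* Hypothetical entry point for a Linux-only feature. On other platforms
 * it does no work and instead reports a clear error to the caller. */
enum probe_status probe_hardware(char *errbuf, size_t errlen)
{
    assert(errbuf != NULL && errlen > 0);

#if defined(__linux__)
    /* The real, Linux-specific detection would happen here. */
    errbuf[0] = '\0';
    return PROBE_OK;
#else
    snprintf(errbuf, errlen,
             "hardware detection is only implemented on Linux");
    return PROBE_UNSUPPORTED;
#endif
}
```

The point of this shape is that the library still compiles and links everywhere, and callers on Windows get an explicit, explainable failure instead of a missing symbol.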
Huh I didn't know libappstream used libfyaml.
For what it's worth, libfyaml is not abandoned or anything.
I still actively develop it, but main development happens in a different branch, gearing up for a 1.0 release.
Yes, the Windows issue is known, but I am not a Windows developer, and it would take some amount of effort to port it. I would be open to reviewing a PR that provides support for it, but it has to go through a proper review first.
Huh I didn't know libappstream used libfyaml.
It only does since last month, replacing libyaml. Thanks for writing libfyaml, support for YAML 1.2 was really needed, and with fyaml we can also avoid copying memory three times for no reason. AppStream loads extremely large files on some systems, so this actually matters (although the duplicate-key detection slows parsing down a lot, and YAML document emission is slower than with libyaml - but not by much).
For what it's worth, libfyaml is not abandoned or anything.
That's great to hear! In hindsight, I should probably have reached out to you first, because AppStream depending on libfyaml definitely means increased scrutiny for the latter, as it will become part of the core/desktop package set in many distros and part of runtimes. So, a heads-up might have been nice ^^
In any case, it's for sure better than libyaml, which looks pretty abandoned (nearly no commits, zero feedback on PRs, ...).
On the Windows front, there were already people willing to work on it and submitting PRs! So, I'm not too worried, that's a really good sign :-)
@ximion Interesting, I know people keep using libfyaml but unfortunately no feedback. People only show up when something is broken :)
Well, if you do load large files you might like what's coming up...
Regarding Windows, it is what it is: that platform is a pain in the neck, and just doing things for fun is impossible on it. If anyone is willing to go through the trouble and make it work, I am certainly open to merging it, but it has to be done properly...
@ximion err, also have you seen this? FYPCF_ALLOW_DUPLICATE_KEYS ?
@ximion err, also have you seen this? FYPCF_ALLOW_DUPLICATE_KEYS ?
I did, I did a fair bit of performance profiling (libfyaml was insanely slow at reading data by comparison at first, until https://github.com/ximion/appstream/commit/d4632d8a4a50d4054a9bb5eaffdfb7cc25dfe7c4).
Interesting, I know people keep using libfyaml but unfortunately no feedback. People only show up when something is broken :)
Hehe, classic experience for an open-source project maintainer. With some luck though, they'll show up with a patch out of nowhere that some company needed :-)
Windows is definitely the odd-one-out from all platforms one can support. But Cygwin and MinGW help a lot.
@ximion Seems like you're going to be heavy users and visible ones. I am quite open to your bug reports and profiling runs.
For instance, I know the emitter is not as fast as libyaml, but that's because the emphasis has been correctness and customizability.
Looks like you're at the other end of the spectrum.
If you have other insights please share.
Btw, the kind of large YAML files AppStream reads are these: https://appstream.debian.org/data/sid/main/
They are pure data, no anchors, no aliases, and a very defined structure. So, writing my own fast parser for a restricted subset of YAML was also an option once ;-)
For instance, I know the emitter is not as fast as libyaml, but that's because the emphasis has been correctness and customizability.
Jup, while evaluating libfyaml, I noticed that - BUT, YAML is read millions of times in libappstream and not generated very often (and it's generated on servers). So, that performance hit was acceptable, as long as we could parse YAML at rapid speeds. The YAML data is parsed on Debian/Ubuntu and all their derivatives every time new metadata is available in the distribution repositories, so being fast there is good. Also (and unfortunately) GNOME Software parses it too via libappstream, which affects startup time.
One thing I can report is that peak memory usage is down for parsing compared to libyaml, which is probably nice for smaller systems with less memory (the Raspberry Pis of the world, although even those have lots of RAM nowadays).
Btw, the kind of large YAML files AppStream reads are these: https://appstream.debian.org/data/sid/main/
Those are not even that large. I've been running tests with ones that go to gigabytes, parsing in a few secs... The APIs are different though, at that speed you can't build a document tree. I only have to spend some time going through the trouble of releasing after fixing the immense backlog...
They are pure data, no anchors, no aliases, and a very defined structure. So, writing my own fast parser for a restricted subset of YAML was also an option once ;-)
Lol, I was once there too. But I guess I'm crazier than you :)
Lol, I was once there too. But I guess I'm crazier than you :)
You were just faster, very much to my benefit now! 😄 Thank you!
Those are not even that large. I've been running tests with ones that go to gigabytes, parsing in a few secs...
The reason why you didn't get a patch or bug report yet is that libappstream parses stuff like that in 300 msec or less, and most of the time is spent on disk I/O anyway. It's not a massive delay. The other reason is that libfyaml enabled some improvements to YAML handling that we don't use yet, currently it's a pretty "close to the old libyaml interface" API, and there is for certain a lot that can be improved. Using more of libfyaml's zero-copy API would be an obvious task. While profiling, I didn't see an obvious bottleneck, so, if I want to parse faster, it will be smaller optimizations... (and it's already fast enough) Well, except for FYPCF_ALLOW_DUPLICATE_KEYS, that one was major ^^
(and the Unicode bugfixes... it took me many hours to find out where the corrupted UTF-8 was coming from and when it happened in some places but not others, ultimately tracing it back to libfyaml versions below 0.9, which just had a bug... Upgrading to 0.9 also sped up handling of localized texts slightly)
Actually I found this here because GIMP requires appstream; I just finished compiling libfyaml from source on Linux. This worked well enough.
I am trying to compile AppStream but run into: meson.build:16:8: ERROR: Problem encountered: Unable to find Docbook XSL stylesheets for man pages. No clue how to resolve this. (I mean, I can install docbook, but this thing was last updated when dinosaurs lived and the update instructions at LFS/BLFS don't work for me, so I decided to abandon the garbage that is docbook - sadly some projects think they still need to use docbook by default.)
In the old GNU configure days, things were easy; --disable-doc or something like that, shown via "./configure --help". I have no idea how to get this for cmake or meson. And as a result, even if it is not GIMP's primary fault, I have a hard time compiling GIMP from source. I was then, as a substitute, using an AppImage (GIMP-3.0.6-x86_64.AppImage) from archlinux. This also worked, until suddenly today it complains about "/home/Temp/.mount_gimpdaHJkp/100_0050.jpg". The latter is probably due to my own system not being set up 100% correctly, but why does GIMP try to look for anything there, when the .jpg I am trying to open is in the local directory?
I have so many questions here. Again, I understand that this is not GIMP's fault alone, but really I feel that the whole dependency management on Linux in general has gotten much, much worse in the last few years. I haven't had those issues in the 15+ years before, give or take. It is also much harder to find documentation these days, as Google search is basically useless, so I guess I am not the only one running into such issues here. Hopefully one day the overall quality will improve, but it seems we do 4 steps forward, then 3 steps backward, and right now GIMP is in backward walking mode, even if the issue lies elsewhere. I feel that the long delay between publishing releases on the GIMP side has really hurt the project.
I'd also like to suggest for libappstream to disable documentation in general - I never read local documentation anymore, online is the only thing I would ever want to look at, so defaulting to mandating it makes no sense IMO.
I have no idea how to get this for cmake or meson.
@rubyFeedback https://github.com/ximion/appstream/blob/main/meson_options.txt
meson.build:16:8: ERROR: Problem encountered: Unable to find Docbook XSL stylesheets for man pages. No clue how to resolve this.
You just need to place the stylesheets in the right directory, so manual pages can be created. Docbook is still used a lot: if you are building LFS, you compiled systemd, which also uses it for man pages. You'll need to get the stylesheets from https://github.com/docbook/xslt10-stylesheets (or use the Debian package as reference). I think systemd nowadays ships with some of the XSLT needed directly in-tree, this may be an idea for AppStream as well...
Completely off-topic for this issue though, just file a new one for Linux builds (but there's nothing we can do for this one - except for maybe eventually shipping our own stylesheet, which I wouldn't be opposed to...).
If it is helpful, I also got blocked by libfyaml not being cross-platform, so I wrote a cross-platform YAML 1.2 parser & emitter to solve the problem: https://github.com/andrewmd5/cyaml
Passes the test suite and has been through review and multiple rounds of fuzzing. Though it is C11 so not sure if that is a blocker.