FreeBSD backend: Implement packageForFile()
As it turned out in https://github.com/ximion/appstream-generator/issues/111, running appstream-generator on an already built repository is too time-consuming: it requires unpacking each package to scan its contents, which can be very slow for some packages.
This change allows for a different workflow. During the package build, appstream-generator process-file is called for each package, right before the archive creation step. The command is actually fed not a file, but a directory that represents the contents of the to-be-created package. This lets asgen process the data much faster, because nothing has to be unpacked.
At the end of the package build, I run appstream-generator publish to actually create the AppStream metadata files.
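The workflow described above can be sketched roughly as follows. This is only an illustration, not the actual build integration: the helper name build_commands, the suite and section names, and the staging paths are hypothetical, and the argument order for process-file and publish is an assumption that should be checked against the asgen documentation. The script only builds and prints the command lines; it does not run them.

```python
import shlex


def build_commands(stage_dirs, suite="default", section="main",
                   asgen="appstream-generator"):
    """Return argv lists: one process-file call per staged package
    directory, followed by a single publish call.

    Assumption: process-file takes SUITE SECTION PATH and publish
    takes SUITE. Suite/section names here are placeholders.
    """
    cmds = [[asgen, "process-file", suite, section, str(d)]
            for d in stage_dirs]
    # Metadata is generated once, after every package has been processed.
    cmds.append([asgen, "publish", suite])
    return cmds


if __name__ == "__main__":
    # Hypothetical staging directories holding unpacked package contents.
    for argv in build_commands(["/tmp/stage/foo", "/tmp/stage/bar"]):
        print(shlex.join(argv))
```

The point of the sketch is the ordering: per-package processing happens while the contents still exist as a plain directory, and publish runs once at the end.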
This will cause issues, as asgen will not be able to clean up data properly - it will likely just delete absolutely everything that was handled by process-file, as that is more a debug tool for quick testing rather than something to be used in production. Using it for all packages will likely also break icon search.
Why is processing the entire archive slow? The contents of packages have to be scanned regardless, and once the archive has been scanned initially, subsequent scans will only happen for new packages, as both the contents of existing packages as well as their metadata are cached.
This will cause issues, as asgen will not be able to clean up data properly - it will likely just delete absolutely everything that was handled by process-file
Yes, I haven't looked into cleaning yet. I'll study the code that decides what should be cleaned and what should stay.
Using it for all packages will likely also break icon search.
Can you please elaborate on that?
Why is processing the entire archive slow?
This is discussed in https://github.com/ximion/appstream-generator/issues/111
The contents of packages have to be scanned regardless, and once the archive has been scanned initially, subsequent scans will only happen for new packages, as both the contents of existing packages as well as their metadata are cached.
This is still unsatisfactory at large scale. The FreeBSD Ports Collection contains almost 35,000 ports at the moment, which roughly maps to 35k packages. The initial scan takes a very long time for me, and the duration of subsequent runs would be totally unpredictable: a small version bump of some dependency might result in rebuilding, and therefore rescanning, some heavyweight package (like Stellarium from the referenced issue). Finally, it just feels like a waste to operate on archives when we have the possibility to operate on the plain filesystem.