New implementation of dependency resolution for bionic packages
Goals of new buildorder.py:
- Fix circular dependencies with meta and subpackages
- Make full build order work
- Remove use of ANTI_DEPENDS
- Reduce unnecessary downloads in ONLINE (-I) mode
Currently only a single cycle exists that is really a cyclic dependency:
valac (its subpackage valadoc) --> graphviz --> librsvg --> valac
This implementation significantly reduces the number of packages downloaded in CI. Here is an example:
The current implementation says that 174 packages need to be downloaded in order to build `icewm`, but in reality just 81 are needed.
Closes: #19779 Closes: #20338
TODO
- [ ] Fix static package resolution in -i mode
- [ ] Fix greedy match in re.sub
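The greedy-match item can be illustrated with a generic example. The pattern and the dependency string below are hypothetical and are not taken from buildorder.py; the sketch only shows how a greedy `.*` inside `re.sub` can remove more than intended, and how a non-greedy `.*?` limits each match.

```python
import re

# Hypothetical dependency string with version constraints in parentheses.
deps = "libfoo (>= 1.0), libbar, libbaz (<< 2.0)"

# Greedy: ".*" runs from the first "(" to the LAST ")", so libbar is deleted too.
greedy = re.sub(r"\s*\(.*\)", "", deps)

# Non-greedy: ".*?" stops at the nearest ")", so each constraint is removed on its own.
non_greedy = re.sub(r"\s*\(.*?\)", "", deps)

print(greedy)
print(non_greedy)
```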
Hey @termux/termux-packages-maintainers, please provide your insights and decide whether separation of bionic and glibc dependency resolution is acceptable to you or not.
Personally, I do not wish to maintain code we aren't concerned with.
glibc-packages are in their own separate repository, so I agree that buildorder.py for glibc should be a separate file so that we can maintain buildorder.py for bionic separately. It's ok for buildorder.py for glibc to stay in this repository rather than the other one, because there are plenty of other places in this repository where glibc is handled, and glibc maintainers can edit it here along with the other glibc-specific code blocks.
I would suggest the names buildorder_bionic.py and buildorder_glibc.py slightly more than bionic_buildorder.py and glibc_buildorder.py, since I believe they would be a little easier to search for. It's up to you, though: if you prefer bionic_buildorder.py and glibc_buildorder.py, you can keep those names.
I’d prefer not to see the implementations for bionic and glibc diverge, since maintaining separate branches would make things harder to keep in sync. I’d also like the improvements from #24698 to be taken into account, at least partially, since that could save us a lot of CPU time. Thanks.
I’d prefer not to see the implementations for bionic and glibc diverge, since maintaining separate branches would make things harder to keep in sync.
Personally I think it's ok for the glibc-specific buildorder.py to be out of sync with the bionic buildorder.py, because the current buildorder.py is never tested on glibc-packages while bionic packages are being developed in this repository. While we develop bionic packages, we run buildorder.py many times without the glibc-packages repository present either locally or in CI. Every run evaluates a dependency tree of all bionic packages in this repository, but no glibc packages are evaluated until the glibc maintainers download the glibc-packages repository locally and develop in it themselves. Therefore, any glibc-specific codepaths in the current buildorder.py are very rarely, or never, tested by maintainers of only the main termux-packages (bionic packages) repository.
Oh, actually it seems like you took some parts of #24698.
Personally, I do not wish to maintain code we aren't concerned with.
What do you mean by code that doesn't concern you?
valac (its subpackage valadoc) --> graphviz --> librsvg --> valac
As far as I can tell, valac does not actually need to be specified as a build dependency of librsvg: librsvg builds successfully in all normally-supported codepaths when valac is removed from its TERMUX_PKG_BUILD_DEPENDS. The reason is that valac is already in both scripts/setup-termux.sh and scripts/setup-ubuntu.sh (and it is good that it is in both, not only one, because those two scripts should be synchronized as much as possible).
We can think of those as "pre-build-dependencies": any build-only dependencies listed in them are mandatory prerequisites for building any termux-packages, both on-device and in the termux-package-builder. They should not be specified unnecessarily in TERMUX_PKG_BUILD_DEPENDS of any package unless there is a special exception for it, and therefore they do not need to be considered at all in Termux's buildorder.py or other dependency calculations, because setup-ubuntu.sh and setup-termux.sh already handle them.
https://github.com/termux/termux-packages/blob/94d4d3be9ddf45329c3c4429cae000f6fba19a26/scripts/setup-ubuntu.sh#L83
https://github.com/termux/termux-packages/blob/94d4d3be9ddf45329c3c4429cae000f6fba19a26/scripts/setup-termux.sh#L46
(@Maxython this might be a good time to mention that setup-termux.sh is currently a separate file from setup-termux-glibc.sh. To me, that supports my opinion that the overall dependency calculations of glibc-packages in other places should also be separate files, and would be more easily maintainable that way, particularly because deciding whether a package needs to be specified explicitly in TERMUX_PKG_BUILD_DEPENDS is definitely influenced by the contents of setup-termux.sh.)
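The "pre-build-dependencies" idea could be sketched as a filter applied before any dependency calculation. This is only an illustration of the argument above, not code from buildorder.py: `HOST_PROVIDED`, `parse_deps`, and `effective_build_deps` are hypothetical names, and the set contains only valac because that is the one package discussed here.

```python
# Hypothetical set of tools assumed to be installed by setup-ubuntu.sh and
# setup-termux.sh; only valac is listed because it is the case discussed above.
HOST_PROVIDED = frozenset({"valac"})


def parse_deps(value: str) -> list[str]:
    """Split a comma-separated dependency string into package names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def effective_build_deps(build_depends: str, host_provided=HOST_PROVIDED) -> list[str]:
    """Drop build dependencies that the build environment already provides."""
    return [dep for dep in parse_deps(build_depends) if dep not in host_provided]


# The current TERMUX_PKG_BUILD_DEPENDS value of librsvg, as shown in the diff.
print(effective_build_deps("g-ir-scanner, valac"))
```

With such a filter, the librsvg --> valac edge would never enter the graph, which has the same effect on the build order as removing valac from build.sh.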
As a result, after considering the status of valac in setup-ubuntu.sh and setup-termux.sh, I tested the following change and locally confirmed that librsvg still builds successfully in a clean termux-package-builder container with it applied:
--- a/packages/librsvg/build.sh
+++ b/packages/librsvg/build.sh
@@ -7,7 +7,7 @@ TERMUX_PKG_SRCURL=https://download.gnome.org/sources/librsvg/${TERMUX_PKG_VERSIO
TERMUX_PKG_SHA256=bc1bbcd419120b098db28bea55335d9de2470d4e6a9f6ee97207b410fc15867d
TERMUX_PKG_AUTO_UPDATE=true
TERMUX_PKG_DEPENDS="fontconfig, freetype, gdk-pixbuf, glib, harfbuzz, libcairo, libdav1d, libpng, libxml2, pango"
-TERMUX_PKG_BUILD_DEPENDS="g-ir-scanner, valac"
+TERMUX_PKG_BUILD_DEPENDS="g-ir-scanner"
TERMUX_PKG_BREAKS="librsvg-dev"
TERMUX_PKG_REPLACES="librsvg-dev"
TERMUX_PKG_VERSIONED_GIR=false
With that explanation out of the way, I should mention that this PR currently would fail to build librsvg in a clean container because of this error happening to meson:
Pkg-config error with 'cairo': Could not generate cflags for cairo:
Package xproto was not found in the pkg-config search path.
Perhaps you should add the directory containing `xproto.pc'
to the PKG_CONFIG_PATH environment variable
In my opinion, though, this is not a real problem with this PR itself. Rather, this buildorder.py rewrite is revealing that a different dependency belongs in TERMUX_PKG_BUILD_DEPENDS of librsvg instead: xorgproto, which could be added like this to resolve the build error:
--- a/packages/librsvg/build.sh
+++ b/packages/librsvg/build.sh
@@ -7,7 +7,7 @@ TERMUX_PKG_SRCURL=https://download.gnome.org/sources/librsvg/${TERMUX_PKG_VERSIO
TERMUX_PKG_SHA256=bc1bbcd419120b098db28bea55335d9de2470d4e6a9f6ee97207b410fc15867d
TERMUX_PKG_AUTO_UPDATE=true
TERMUX_PKG_DEPENDS="fontconfig, freetype, gdk-pixbuf, glib, harfbuzz, libcairo, libdav1d, libpng, libxml2, pango"
-TERMUX_PKG_BUILD_DEPENDS="g-ir-scanner"
+TERMUX_PKG_BUILD_DEPENDS="g-ir-scanner, xorgproto"
TERMUX_PKG_BREAKS="librsvg-dev"
TERMUX_PKG_REPLACES="librsvg-dev"
TERMUX_PKG_VERSIONED_GIR=false
(most distros, for example Arch Linux https://archlinux.org/packages/extra/x86_64/libx11/ resolve this by simply making xorgproto a runtime dependency of many packages that get pulled in by most GUI libraries, like for example libx11, but for some reason in Termux we consider xorgproto to be a "build-only dependency" in all of those cases where Arch Linux does not, so to follow preexisting Termux dependency convention rather than exactly copying Arch Linux in every single decision, xorgproto would be in TERMUX_PKG_BUILD_DEPENDS of librsvg in this case as in other cases)
@MrAdityaAlok I continued testing and I can report what seems like it might be an actual problem with the PR in its current form. Hopefully it can be fixed with relatively few changes, but I'm not sure.
This command:
scripts/run-docker.sh ./build-package.sh -I -f godot
will currently produce this error if this PR is used:
Exception: Cycle exists: ['speechd', 'speechd-data', 'speechd']
The presence of speechd-data in TERMUX_PKG_DEPENDS of speechd indicates that speechd-data is a runtime-only dependency of speechd, not a build dependency, because the current (for example, hypothetically bumped) version of speechd-data cannot exist before the parent package speechd is built. That situation should be detected automatically, and speechd-data should be treated as a "runtime-only dependency" of speechd.
More precisely, when building reverse dependencies of speechd in -I mode, as shown above, buildorder.py should automatically treat speechd-data as a TERMUX_SUBPKG_DEPEND_ON_PARENT=false package, because when it is later generated as a subpackage, it is not marked as depending on its parent package even though TERMUX_SUBPKG_DEPEND_ON_PARENT=false is not explicitly specified:
https://github.com/termux/termux-packages/blob/f4e4b3148ce8dce9dad81d282d44dcee3563b50e/scripts/build/termux_create_debian_subpackages.sh#L106
This should mean that in -I mode, when speechd is downloaded, speechd-data should be downloaded as a dependency of speechd, and not the other way around (because that would match how apt install speechd and apt install speechd-data both work on-device).
Do you know what the correct changes to support this test case might be?
Thanks! That was handled in my previous implementation, but I added a wrong check in the "New faster implementation" commit. I'll fix that.
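As a rough sketch of the behavior requested above (not the PR's actual fix), a dependency on a package's own subpackage can be dropped when build edges are computed, so it never forms a cycle. The function name `build_edges` and the dependency lists are hypothetical; only the speechd/speechd-data relationship comes from the report, and the glib entry is made up for illustration.

```python
def build_edges(pkg_deps, subpkg_parent):
    """Return parent-level build edges, skipping deps on a package's own subpackages.

    pkg_deps: maps package name -> list of dependency names (may be subpackages).
    subpkg_parent: maps subpackage name -> parent package name.
    """
    edges = {}
    for pkg, deps in pkg_deps.items():
        resolved = set()
        for dep in deps:
            # A subpackage is produced by building its parent package.
            parent = subpkg_parent.get(dep, dep)
            if parent == pkg:
                # Own subpackage: created by this very build, so runtime-only.
                continue
            resolved.add(parent)
        edges[pkg] = sorted(resolved)
    return edges


# Illustrative data; only speechd -> speechd-data is taken from the report.
pkg_deps = {"speechd": ["speechd-data", "glib"], "godot": ["speechd"], "glib": []}
subpkg_parent = {"speechd-data": "speechd"}
print(build_edges(pkg_deps, subpkg_parent))
```

The self-edge speechd --> speechd-data --> speechd disappears, while the real edge from godot to speechd is kept.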
TERMUX_PKGS__BUILD__NO_BUILD_UNNEEDED_SUBPACKAGES will also need to be supported by the buildorder script if we go with this implementation.
- https://github.com/termux/termux-packages/pull/24647#issuecomment-2859812965
Personally, I do not wish to maintain code we aren't concerned with.
What do you mean by code that doesn't concern you?
Personally, I do not wish to maintain code that isn’t relevant to our scope.
By “code that doesn’t concern us,” I mean packages outside the Termux infrastructure (specifically, non-bionic packages).
If glibc packages are officially merged into termux-packages, I’d be more than happy to support them. But since you are currently the only one maintaining them, I don’t think it makes sense for me to handle code related to their support.
This is just my personal opinion. Others might differ.
By “code that doesn’t concern us,” I mean packages outside the Termux infrastructure (specifically, non-bionic packages).
This is an utterly meaningless justification for refusing to implement the features and capabilities of the original/full-fledged buildorder.py in your new buildorder.py format. The termux-packages repository officially supports building glibc packages, and although termux-packages only contains bionic packages, this does not justify splitting the architecture, giving packages different capabilities (which may cause problems and confusion in the future), especially in the issue of package dependency analysis architecture. If you want to rewrite buildorder.py to fix issues, then by all means, do what you want, but in doing so, you commit to preserving all the features and functionality currently present in the original buildorder.py (this obligation is unavoidable). In other words, you are not required to maintain the entire glibc-packages repository and its packages that you have never touched. The only requirement is that you preserve the features and functionality present in the original buildorder.py, provided your new buildorder.py does not replace these features and functionality. Your "I don't wish to" does not mean that the features and functionality are incompatible with your new buildorder.py. If you find it difficult or unclear how to implement all this in the new buildorder.py, then nothing prevents you from asking me for help in immediately implementing the features from the original buildorder.py.
If glibc packages are officially merged into termux-packages, I’d be more than happy to support them.
I have a more ideal proposal that would make the package builder "neutral." You're unhappy with the lack of glibc packages in termux-packages. As I understand it, this prevents you from testing glibc packages (and it doesn't matter that you can test your builder simultaneously in both the termux-packages repository and the glibc-packages repository). So, my proposal is to start putting the package builder in a separate repository, so that the package builder and its scripts are neutral to package formats. Bingo, now you can test bionic and glibc packages simultaneously without any problems, since you now have to clone all the necessary packages first. ~~Self-irony.~~
Either it's governed as part of the Org and we maintain it as such, which is not currently the case.
Or glibc-packages and termux-pacman are second party projects, like the TUR, and you need to make sure you have what build infrastructure you need for it in the separate project.
We can certainly make some accommodations to make things easier for you. If you need support for Glibc in the build order here then you will need to explain what you need, how the build system for glibc packages currently works and what we can do to make sure things work how they should for you.
This isn't a matter of anyone being against Glibc support, it's your project and you've been mostly running it yourself separately from the Org. It's a matter of unfamiliarity. We don't know what you need in terms of support in this repo and you're the person with the experience to explain it.
@Maxython It's not that I do not want to implement it. It's just that this repo isn't concerned with your project. The build system we maintain should only be designed for our purpose or can accommodate changes that allow others to implement their own. We shouldn't implement the needs of everyone.
Understand that you want me to accommodate changes for a packaging system we do not handle. Suppose someday someone else decides to support some other packaging format and they too want us to accommodate their changes. It is not feasible and it will add unnecessary burden to us.
That said, I would like to propose that you keep a copy of the current build system in your project and only merge changes that concern your project. This way we both have less of a thing to worry about. Also, this will make your project independent of changes we do here.
I have a good idea, I will try to build some glibc-packages using this new buildorder.py without initially changing very much in either glibc-packages or the new buildorder.py, and I will report back what the first problem seems to me to be. That might help to narrow down the specific problems which would prevent the use of this buildorder.py as-is for glibc-packages.
It's not that I do not want to implement it. It's just that this repo isn't concerned with your project.
Why do you think the termux-packages repository is unrelated to the glibc-packages repository? I'll repeat that this is a false assumption, based on the lack of glibc packages in the termux-packages repository. The glibc-packages repository and the termux-packages repository are interconnected in terms of supporting the compilation of glibc packages in builder, since glibc-packages represents the necessary components used to build glibc packages through builder, and termux-packages represents the architecture (i.e., builder) that can work with components and allows for building packages (and those same components) in the same way as in the termux-packages repository.
The build system we maintain should only be designed for our purpose or can accommodate changes that allow others to implement their own. We shouldn't implement the needs of everyone.
Okay, I won't argue with you on this point. Just answer me one question: why is your purpose more important than the issue of preserving capabilities and ensuring equality between building bionic and glibc packages? (A response like "termux-packages is not related to glibc-packages" is not accepted, because, as I said above, this is not true.)
Understand that you want me to accommodate changes for a packaging system we do not handle.
Which changes??? I just ask you to keep some of the functionality that is in the original buildorder.py in your new buildorder.py, and not to separate buildorder.py for bionic and for glibc, because this violates the overall concept of building a package.
Suppose someday someone else decides to support some other packaging format and they too want us to accommodate their changes. It is not feasible and it will add unnecessary burden to us.
What do you mean by "unnecessary burden to us"? It's just that, as someone (i.e., a termux and termux-pacman employee) who maintains pacman, glibc, and also ensures that previous implementations and features are compatible with new ones, I'm not familiar with that concept.
That said, I would like to propose that you keep a copy of the current build system in your project and only merge changes that concern your project. This way we both have less of a thing to worry about. Also, this will make your project independent of changes we do here.
I'm sorry, but I have to tell you no. I don't support the continued support of glibc and bionic packages in this format. So, expect me to make the necessary changes to your new buildorder.py that will ensure proper support for the package dependency analysis architecture.
Seems like @MrAdityaAlok has made sure building packages is still possible for bionic and glibc, so as I see it the PR does not break anything or make performance worse than today. Personally I never build for glibc, and I don't think we want to make it mandatory for package maintainers and infrastructure maintainers to always test their changes for both bionic and glibc builds, so the type of split as done here seems fine to me.
why is your purpose more important than the issue of preserving capabilities and ensuring equality between building bionic and glibc packages
Capabilities are preserved here, right? Aditya has preserved the current script for glibc to use.
I just ask you to keep some of the functionality that is in the original buildorder.py in your new buildorder.py
For someone who is not used to building glibc packages: which functionality that glibc-packages needs, specifically, is missing in the new buildorder.py?
expect me to make the necessary changes to your new buildorder.py that will ensure proper support for the package dependency analysis architecture
Sounds great, simplifying scripts and improving maintainability in termux-packages is what we all want, let's review that PR when it is ready. It would be great if glibc could use the new buildorder.py script as well, but it is probably best if the people who regularly build for glibc implement and test it. Sounds like @robertkirkman has already started testing, so I'm sure we'll have something to review in the near future!
@Maxython I tried implementing support for glibc packages, but it fails due to many cyclic dependencies. You need to fix these before we continue. I tested it with agnostic-apollo's implementation too, but that also fails with similar errors (when topologically sorting).
Cycles
gcc-libs
Cycle: ['gcc-libs-glibc', 'doxygen-glibc', 'gcc-libs-glibc']
Cycle: ['gcc-libs-glibc', 'libisl-glibc', 'libgmp-glibc', 'gcc-libs-glibc']
Cycle: ['gcc-libs-glibc', 'binutils-glibc', 'libbz2-glibc', 'bash-glibc', 'ncurses-glibc', 'gcc-libs-glibc']
Cycle: ['gcc-libs-glibc', 'binutils-glibc', 'libdebuginfod-glibc', 'libcurl-glibc', 'krb5-glibc', 'libverto-glibc', 'libevent-glibc', 'openssl-glibc', 'gcc-libs-glibc']
Cycle: ['gcc-libs-glibc', 'binutils-glibc', 'libdebuginfod-glibc', 'libcurl-glibc', 'libpsl-glibc', 'libidn2-glibc', 'gcc-libs-glibc']
Cycle: ['gcc-libs-glibc', 'binutils-glibc', 'libdebuginfod-glibc', 'libcurl-glibc', 'brotli-glibc', 'gcc-libs-glibc']
Cycle: ['gcc-libs-glibc', 'binutils-glibc', 'libdebuginfod-glibc', 'libelf-glibc', 'zstd-glibc', 'gcc-libs-glibc']
attr
Cycle: ['attr-glibc', 'gettext-glibc', 'libacl-glibc', 'attr-glibc']
glib
Cycle: ['glib-glibc', 'gobject-introspection-glibc', 'libgirepository-glibc', 'glib-glibc']
Note: The above cycles exist when building in -i mode. Many more exist when building without it, for example for python:
❯ ./scripts/new_buildorder.py python-glibc
Cycle: ['python-glibc', 'llvm-glibc', 'python-glibc']
Cycle: ['python-glibc', 'llvm-glibc', 'libxml2-glibc', 'python-glibc']
Cycle: ['python-glibc', 'llvm-glibc', 'binutils-libs-glibc', 'libelf-glibc', 'libcurl-glibc', 'krb5-glibc', 'e2fsprogs-glibc', 'util-linux-glibc', 'python-glibc']
Cycle: ['libacl-glibc', 'attr-glibc', 'gettext-glibc', 'libacl-glibc']
Topological sorting was implemented to actually catch these real cycles so that they could be fixed and packages be buildable from scratch, like for bootstraps. I don't know the issues for glibc and why these cycles need to exist, and wasn't testing them.
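To make the point about topological sorting concrete, here is a minimal sketch using the standard-library `graphlib` module (Python 3.9+). It is not the code from either implementation discussed here; `build_order` is a hypothetical helper, and the graphs are reduced to the packages named in this thread.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(deps):
    """Return packages in an order where each one follows its dependencies.

    deps maps a package to the packages it depends on.
    Raises CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


# Acyclic example: the valac chain once librsvg no longer depends on valac.
print(build_order({"valac": ["graphviz"], "graphviz": ["librsvg"], "librsvg": []}))

# The attr cycle reported above for glibc-packages.
cyclic = {
    "attr-glibc": ["gettext-glibc"],
    "gettext-glibc": ["libacl-glibc"],
    "libacl-glibc": ["attr-glibc"],
}
try:
    build_order(cyclic)
except CycleError as err:
    # err.args[1] is a list of nodes forming one cycle (first and last are equal).
    print("Cycle:", err.args[1])
```

A sorter of this kind cannot produce any order while a real cycle exists, which is why such cycles surface as errors instead of being silently tolerated.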
@MrAdityaAlok I continued testing and I can report what seems like it might be an actual problem with the PR in its current form. Hopefully it can be fixed with relatively few changes, but I'm not sure.
This command:
scripts/run-docker.sh ./build-package.sh -I -f godot
will currently produce this error if this PR is used:
Exception: Cycle exists: ['speechd', 'speechd-data', 'speechd']
Do you know what the correct changes to support this test case might be?
@robertkirkman it's fixed now. You may test further.
I tried implementing support for glibc packages but due to many cyclic dependencies it fails. You need to fix these before we continue.
Basically I know about that too, but I found them using a slightly different method. There is technically a way to find and debug them with the current buildorder.py, without actually editing it, so I am doing it that way first, and when I finish I will make my PR to glibc-packages to fix everything.
Hey @twaik, I was trying to implement the parallelization you suggested (#24698). I have one question: why kill all jobs even if only a single download failed? We would only be building the packages whose downloads failed, so why kill everything? Maybe I'm missing something...
Because it was implemented for CI, and if a single file is not available, waiting for the other jobs to complete is pointless. We could probably kill all ongoing downloads only when running in CI and keep downloading otherwise.
Now bionic_buildorder.py can correctly analyze the dependencies of all glibc packages during online compilation (i.e. when dependencies are installed, not compiled).
Instead of adding a new variable you could have just returned a VIRTUAL package when in OFFLINE mode and when only installing was set (I use that locally).
But anyway, that is not my point. This PR is to identify cyclic dependencies and fix them. Your check `if dep == self.root_pkg` bypasses that. That is not the correct method. You should instead try to fix those cycles. Maybe @robertkirkman is already in the process of doing that.
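The objection can be illustrated with a small depth-first cycle search. This is a sketch under assumptions: the real check lives in code not shown in this thread, so it is modeled here by a hypothetical `skip_edges_to_root` flag, and `find_cycle` is not a function from either implementation.

```python
def find_cycle(deps, root, skip_edges_to_root=False):
    """Depth-first search from root; return the first cycle found as a list, or None."""
    path, on_path, done = [], set(), set()

    def visit(pkg):
        path.append(pkg)
        on_path.add(pkg)
        for dep in deps.get(pkg, []):
            if skip_edges_to_root and dep == root:
                continue  # hypothetical bypass: ignore edges pointing back at the root
            if dep in on_path:
                # Found a back edge: report the path from dep around to dep again.
                return path[path.index(dep):] + [dep]
            if dep not in done:
                cycle = visit(dep)
                if cycle:
                    return cycle
        on_path.discard(pkg)
        path.pop()
        done.add(pkg)
        return None

    return visit(root)


# The attr cycle from the list earlier in this thread.
deps = {
    "attr-glibc": ["gettext-glibc"],
    "gettext-glibc": ["libacl-glibc"],
    "libacl-glibc": ["attr-glibc"],
}
print(find_cycle(deps, "attr-glibc"))
print(find_cycle(deps, "attr-glibc", skip_edges_to_root=True))
```

With the bypass enabled the search reports no cycle at all, even though the packages still cannot be built from scratch in any order.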
Because it was implemented for CI, and if a single file is not available, waiting for the other jobs to complete is pointless. We could probably kill all ongoing downloads only when running in CI and keep downloading otherwise.
No, I meant: why can't we first download all packages and later build the packages that failed to download, instead of killing the job in the middle?
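The "download everything first, then build whatever failed" idea could look roughly like the sketch below. It is not the implementation from #24698: `download_all` and `fake_download` are hypothetical names, and the downloader is a stand-in that simply raises for one package.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(packages, download, workers=4):
    """Try to download every package; return (downloaded, to_build) name lists.

    download is a callable taking a package name and raising on failure.
    A failed download does not stop the others; the package is queued for building.
    """
    downloaded, to_build = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(download, pkg): pkg for pkg in packages}
        for future in as_completed(futures):
            pkg = futures[future]
            if future.exception() is None:
                downloaded.append(pkg)
            else:
                to_build.append(pkg)
    return sorted(downloaded), sorted(to_build)


def fake_download(pkg):
    """Stand-in downloader: pretends librsvg is missing from the repository."""
    if pkg == "librsvg":
        raise FileNotFoundError(f"{pkg} not found in repository")
    return pkg


print(download_all(["glib", "librsvg", "pango"], fake_download))
```

A CI-oriented variant could still abort early on the first failure; this sketch only shows the non-aborting behavior asked about here.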
Instead of adding a new variable you could have just returned a VIRTUAL package when in OFFLINE mode and when only installing was set (I use that locally).
Thanks for the advice, I will study your method and if it works, I will use it.
Your check `if dep == self.root_pkg` bypasses that.
This circumvention of circular dependencies partially makes sense because the (root) package where the circular dependency "begins" is guaranteed to compile. I'll note that I haven't fully explored the algorithms of your new buildorder.py yet, so I might change the way I analyze such packages.
This PR is to identify cyclic dependencies and fix them.
What do you mean by "fix them"? If you mean eliminating cyclic dependencies, then I have a question: do you have a roadmap for further development of your PR, and if so, what are the goals of that roadmap? I asked this question to properly understand your approach to cyclic dependencies and what you want to do about them.
You should instead try to fix those cycles. Maybe @robertkirkman is in the process of the same.
If @robertkirkman can find and remove useless circular dependencies in glibc-packages without limiting packages or making them more complex, that would be great. However, I'll say right away that he definitely won't be able to remove all circular dependencies, as some of them are necessary or very complex/large to fix.
This circumvention of circular dependencies partially makes sense because the (root) package where the circular dependency "begins" is guaranteed to compile.
How does it make sense, and how is the root package guaranteed to compile?
Suppose a package A depends on another package B, which in turn directly or indirectly depends on A. How is the compilation of A guaranteed then?
What do you mean by "fix them"? If you mean eliminating cyclic dependencies, then I have a question: do you have a roadmap for further development of your PR, and if so, what are the goals of that roadmap? I asked this question to properly understand your approach to cyclic dependencies and what you want to do about them.
Yes, I meant eliminate them.
I do not think any further roadmap is needed for bionic packages. Only one circular dependency exists (see first comment) and that is fixed now.
Cyclic dependencies are like a chicken-and-egg problem. There will be a package from which everything is derived. Maybe that absolute package also depends on some of the derived packages. This can be resolved by first building the absolute package with some of its features disabled (called bootstrapping) and then building the fully functional package from it. This is generally how most compilers are compiled.
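The bootstrapping approach can be sketched as a graph transformation: the package that starts the cycle is split into a reduced first stage and a full second stage. This is an illustration only. `bootstrap_order`, the `-stage1` suffix, and the notion of `optional_deps` are assumptions made for the example and do not exist in the Termux build system; the package names A and B follow the question above.

```python
from graphlib import TopologicalSorter


def bootstrap_order(deps, bootstrap_pkg, optional_deps):
    """Break a cycle by building bootstrap_pkg twice.

    Stage 1 omits optional_deps (features disabled). Every other package builds
    against stage 1, and the full package is built afterwards with all its deps.
    """
    stage1 = f"{bootstrap_pkg}-stage1"
    graph = {}
    for pkg, pkg_deps in deps.items():
        if pkg == bootstrap_pkg:
            graph[pkg] = set(pkg_deps)  # full build keeps all of its dependencies
        else:
            # Everyone else depends on the reduced stage-1 package instead.
            graph[pkg] = {stage1 if d == bootstrap_pkg else d for d in pkg_deps}
    graph[stage1] = set(deps[bootstrap_pkg]) - set(optional_deps)
    return list(TopologicalSorter(graph).static_order())


# A needs B for an optional feature, and B needs A: a two-package cycle.
deps = {"A": ["B"], "B": ["A"]}
print(bootstrap_order(deps, "A", optional_deps=["B"]))
```

The resulting order builds the reduced A first, then B against it, and finally the fully featured A, which matches the two-step process described above.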
The following question arose: have you studied my PR #20513 in detail (about the implementations I propose to solve problems due to cyclic dependencies and how they work)?