conda-forge.github.io
General setup of static outputs vs. shared ones
Static builds are already an exception in conda-forge (see CFEP-18), but some do exist for specific use-cases. The micromamba feedstock is a good example of a user of static libs (presumably to stay "micro" and have no runtime dependencies).
Several feedstocks of this kind follow this pattern:
    outputs:
      - name: libxyz
        build:
          run_exports:
            - {{ pin_subpackage('libxyz') }}
        [...]
      - name: libxyz-static
        requirements:
          build:
            - [...]
          host:
            - {{ pin_subpackage("libxyz", exact=True) }}
          run:
            - {{ pin_subpackage("libxyz", exact=True) }}
This has some advantages & disadvantages:
- 👍
  - deduplication of files between static & shared builds
  - shared & static builds are co-installable
    - I'd argue this is actually an anti-pattern, but it can become relevant for dependencies with run-exports, see below
- 👎
  - libxyz-static pulls in both dynamic and static builds
    - this makes it possible for static builds to "silently" depend on the shared builds
      - in practice this is not a huge issue though, because run-exports from libxyz don't get picked up for a -static host dep, so reliance on the dynamic lib would be detected by going boom at runtime
  - CMake integration is impossible for both packages simultaneously
    - either the targets are wrong for the non-static version ("libzstd.a" not found)
    - or the CMake-specific files clobber each other
The CMake issue in particular is quite painful, because an ever-increasing number of packages in C/C++-land come with built-in CMake integration. For example, recent LLVM-builds failed on https://github.com/conda-forge/zstd-feedstock/issues/58. There, I untangled the dependence in https://github.com/conda-forge/zstd-feedstock/pull/62, but this has the following trade-off:
[...] we can have only 3 out of the following 4 (AFAICT):
1. working CMake integration
2. no clobbering of CMake files
3. no manual hacking (of CMake files resp. upstream CMake integration)
4. zstd & zstd-static co-installable
Currently I've chosen to give up 4. - perhaps an argument can be made that giving up 2. is less harmful despite being against best practice.
The lack of being able to co-install libxyz and libxyz-static would become problematic in the following scenario (quoting @hmaarrfk):
- User builds package a, which depends on zstd dynamic. zstd exports a requirement of zstd.
- User tries to build package b that depends on a and zstd-static. This user can no longer build their package.
Package b cannot depend on just zstd-static because the dependency on zstd is controlled by package a.
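A minimal sketch of that scenario as a recipe fragment for package b, assuming zstd-static has been made non-co-installable with zstd. The package names a and b come from the quote; the version and the layout are purely illustrative.

```yaml
# Hypothetical meta.yaml fragment for package "b" (names a/b from the scenario above).
package:
  name: b
  version: "1.0"   # placeholder version

requirements:
  host:
    # "a" was built against the shared zstd; the run-export of zstd gave "a"
    # a run dependency on zstd, so installing "a" also installs zstd.
    - a
    # Assumption: zstd-static conflicts with zstd. The solver then cannot
    # satisfy both entries, and the host environment fails to resolve.
    - zstd-static
```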
In this case, that PR was merged (and the existing consumers are not affected by the run-export-induced conflict between dynamic & static libs), but now I've encountered the same in libprotobuf, and fixing https://github.com/conda-forge/libprotobuf-feedstock/issues/68 is not possible without running into the same issue, hence I thought I'd open a wider discussion.
@hmaarrfk sketched out a different approach:
The other alternative, which I don't have time to flesh out, is to return to the previous system, but, to ensure that:
- cmake works with the dynamic libraries.
- cmake fails to automatically report the static libraries.
This would favor the dynamic library usage, instead of forcing users of cmake+zstd to have the static libraries installed.
to which I said:
Wouldn't that defeat the point of zstd-static? If I use an output named like that, I'd certainly expect the static lib to be picked up, not the dynamic one (i.e. compiling against zstd-static would seem to work, but would actually do the same as compiling against zstd, i.e. create a runtime dependence on zstd).
Summary
- The setup currently used by several static outputs is incompatible with CMake, which is IMO a bigger downside than getting some file duplication and even co-installability
- In general, I think it's really iffy to have static and shared builds of the same libs in the same host environment.
- There are other feedstocks where this setup is not possible/sensible, e.g. https://github.com/conda-forge/abseil-cpp-feedstock/pull/35
Proposal: Forbid co-installation of libxyz-static builds with libxyz
This would solve the CMake issues (each output could have its own copy of the respective CMake files without risks of inconsistency or clobbering). The only downside would be that users of -static packages cannot depend on another package (say a) which has been built against a shared version of the same lib. Since most of the static libs live pretty close to the bottom of the stack, I don't think this would be much of a problem. And even so, there would be a solution: build a (perhaps also static?) version of a against libxyz-static.
I think (users of) static libs are special enough that we can inflict this (hypothetical!) pain on them
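As a rough sketch of what this proposal could look like in a recipe: the fragment below uses the placeholder name libxyz from above and expresses the mutual exclusion through `run_constrained` with a version bound that no real build can satisfy. That particular bound is an assumption chosen for illustration, not something prescribed in this issue.

```yaml
# Sketch only: "libxyz" is the placeholder name used throughout this issue.
outputs:
  - name: libxyz
    build:
      run_exports:
        - {{ pin_subpackage('libxyz') }}
    requirements:
      run_constrained:
        # Assumption: an unsatisfiable bound, so the solver refuses to
        # install libxyz-static into the same environment as libxyz.
        - libxyz-static <0.0.0a
    # contents: shared library, headers, CMake files for the shared targets

  - name: libxyz-static
    requirements:
      run_constrained:
        # Mirror constraint: libxyz cannot be installed next to libxyz-static.
        - libxyz <0.0.0a
    # contents: static library, headers, CMake files for the static targets
```

With this layout each output can ship its own copy of the CMake files, since the two sets can never land in the same prefix.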
Thoughts @conda-forge/core?
Libraries should use a separate target name for their static and dynamic libs so the CMake files don't clobber each other and there is no ambiguity about what is being linked?
https://stackoverflow.com/a/2152157 https://stackoverflow.com/a/29824424
Could also create more outputs of the recipe to avoid clobbering. i.e. run_export only libs (no headers or CMake files).
Libraries should use a separate target name for their static and dynamic libs so the CMake files don't clobber each other and there is no ambiguity about what is being linked?
I agree with this, however, it doesn't match current widespread practice...
Could also create more outputs of the recipe to avoid clobbering. i.e. run_export only libs (no headers or CMake files).
Could you sketch your idea w.r.t. the most important aspects of a meta.yaml that'd do what you propose?
In general, I think it's really iffy to have static and shared builds of the same libs in the same host environment.
Having static and shared builds in the same environment has been the standard in Unix environments for decades. I don't know what you mean by iffy here. Please understand that conda is not special and we should do well to learn from the past.
In the case of zstd, the way you fixed it is not ideal. What you should do is first build the static build, install and then do the shared lib only. CMake configs will support only shared. In my experience, projects supporting both static and shared libraries at the same time with CMake are rare. Most projects can build only one of those at a time.
In the case of micromamba, there is no mistake about using static libraries because it explicitly searches for .a files. In other build systems, they usually add -Wl,-Bstatic to ensure static libraries are searched first. In any case, packages that require static building do it carefully, and their build systems are usually great at limiting to static builds.
In general, I think it's really iffy to have static and shared builds of the same libs in the same host environment.
Having static and shared builds in the same environment has been the standard in Unix environments for decades. I don't know what you mean by iffy here. Please understand that conda is not special and we should do well to learn from the past.
By iffy I mean:
- having wrong or broken CMake metadata
- having to cross your fingers (or be very careful) to get the right artefact picked up
- in principle, this would allow frankenstein-ian mixes of partially static / partially dynamic linking. If this only affects a handful of rarely used symbols, the corresponding runtime failure might not even be apparent immediately
I think a feedstock should make a specific choice of depending on the shared or static builds of a given dependency, and then only depend on that. Having both in the same environment makes it much harder to tell what's happening under the hood.
In the case of zstd, the way you fixed it is not ideal.
Thanks for the feedback. Happy to find a better way to do it, which is why I opened this issue. Already in that issue I had sketched my preferred solution, which would be:
| output | contains shared lib | contains static lib | comment |
|---|---|---|---|
| libxyz | ✔️ | ➖ | not co-installable with libxyz-static |
| libxyz-static | ➖ | ✔️ | not co-installable with libxyz |
instead of the previous (== state of other static builds in conda-forge):
| output | contains shared lib | contains static lib | comment |
|---|---|---|---|
| libxyz | ✔️ | ➖ | |
| libxyz-static | ✔️ (transitively) | ✔️ | run-depends on libxyz output |
and the current (after https://github.com/conda-forge/zstd-feedstock/pull/62):
| output | contains shared lib | contains static lib | comment |
|---|---|---|---|
| libxyz | ✔️ | ➖ | |
| libxyz-static | ✔️ | ✔️ | independent of libxyz output |
What you should do is first build the static build, install and then do the shared lib only. CMake configs will support only shared.
IIUC, only the shared builds would have CMake metadata. I dislike this because we'd be "lying" to any consuming feedstock that requests libxyz-static and happens to use CMake, i.e. it would be using the shared lib instead of the explicitly requested static lib.
In the case of micromamba, there is no mistake about using static libraries because it explicitly searches for .a files.
I had no doubt that it was being used correctly there, but IMO we should have more "guard rails" for using these static libs. Or if we decide not to, rename them to have a leading underscore so that they're clearly marked as "here be dragons".
In short, IMO:
- CMake integration should just work (for all cases)
- separation of shared/static builds leads to safer & more understandable recipes
- allowing mixing for the sake of "it was always done like this on unix" is not helpful IMO
- the "cost" of separating shared & static builds affects only "expert" feedstocks, it's not even clear that it would be an issue in practice, and there'd be a reasonable work-around.
IIUC, only the shared builds would have CMake metadata. I dislike this because we'd be "lying" to any consuming feedstock that requests libxyz-static and happens to use CMake, i.e. it would be using the shared lib instead of the explicitly requested static lib.
We are not lying. There would be no libxyz-static CMake target, only libxyz-shared.
CMake integration should just work (for all cases)
You are going into hypotheticals. Please show one package where having only the shared build in cmake will fail when a project like zstd is built like how I said. Otherwise this conversation is moot.
We are not lying. There would be no libxyz-static CMake target, only libxyz-shared.
Having only the shared target is what worried me, because a feedstock using CMake & containing something like find_package(libxyz) in the upstream CMakeLists.txt, would find the shared builds, even though from the POV of conda & the meta.yaml, we'd be specifying libxyz-static as a host-dep.
You are going into hypotheticals. Please show one package where having only the shared build in cmake will fail when a project like zstd is built like how I said.
I gave two examples in the OP, zstd & libprotobuf (the latter has no cmake integration on unix yet). Lots of feedstocks that are consuming those are CMake-based. If any of those wants to use the static builds for whatever reason (and is not as exceedingly careful as micromamba), it would break. I think https://github.com/conda-forge/onnxruntime-feedstock would be a candidate (using CMake & depending on libprotobuf-static), for example.
Otherwise this conversation is moot.
I don't think this is accurate (or fair). So far, the balance of pros/cons is IMO leaning in favour of changing (what are the benefits of the status quo, aside from not having to change something?); one more aspect against the status quo is that even your described solution has a higher integration cost because the build scripts between static/shared would become more involved (or requiring patches) to not pick up the static targets, whereas my proposal just needs cmake {,build,install}.
I gave two examples in the OP, zstd & libprotobuf (the latter has no cmake integration on unix yet). Lots of feedstocks that are consuming those are CMake-based. If any of those wants to use the static builds for whatever reason (and is not as exceedingly careful as micromamba), it would break. I think https://github.com/conda-forge/onnxruntime-feedstock would be a candidate (using CMake & depending on libprotobuf-static), for example.
No, zstd and libprotobuf are not examples. I mean downstream packages where this is needed. micromamba obviously doesn't care. So you only have the example of onnxruntime. Please go into detail about why onnxruntime depends on libprotobuf-static.
one more aspect against the status quo is that even your described solution has a higher integration cost because the build scripts between static/shared would become more involved (or requiring patches) to not pick up the static targets, whereas my proposal just needs cmake {,build,install}.
No, it doesn't. You just need cmake {build, install}. No patches necessary. You just need to make libzstd-static depend on libzstd in your solution.
No, zstd and libprotobuf are not examples. I mean downstream packages where this is needed.
I'm well-aware of that, I just said that any dependent package using CMake & wanting a static lib for any reason would be an example.
So you only have the example of onnxruntime. Please go into detail about why onnxruntime depends on libprotobuf-static.
I feel this is moving the goal posts; previously you asked what would break, now you ask why that feedstock needs a static lib. That's a deliberation per feedstock, I don't know this case specifically, or if it could be removed now; but that's beside the point, because it's an example of what I was describing.
No, it doesn't. You just need cmake {build, install}. No patches necessary.
Could you sketch how to do this based on - for example - the zstd feedstock? I'm not sure how one would build the static lib (currently just using cmake and ZSTD_BUILD_{STATIC,SHARED}=ON/OFF) but make sure the corresponding targets don't get installed, or at least I can't think of a way that doesn't involve patching.
You just need to make libzstd-static depend on libzstd in your solution.
How would the CMake files of zstd-static not clobber those of zstd? (though I also did say in the OP that accepting the clobbering might be the least-bad trade-off)
I feel this is moving the goal posts; previously you asked what would break, now you ask why that feedstock needs a static lib. That's a deliberation per feedstock, I don't know this case specifically, or if it could be removed now; but that's beside the point, because it's an example of what I was describing.
Since onnxruntime works just fine now, why are you insisting that you need cmake files to mention static just for onnxruntime to build correctly?
Since onnxruntime works just fine now, why are you insisting that you need cmake files to mention static just for onnxruntime to build correctly?
It currently uses the static lib, but does that without being confused by shared-only CMake files, because those don't exist yet due to https://github.com/conda-forge/libprotobuf-feedstock/issues/68 (that issue needed upstream fixes first which have landed now, but adding the CMake integration in conda-forge for libprotobuf would run into the issue I'm describing here).
But zooming out a bit - why is it controversial that projects might want to consume static libs through CMake? That's a pretty normal thing to be doing (granted, not in conda-forge due to CFEP-18, but the exceptions that exist shouldn't have to be barred from using CMake?)...
Regarding examples, https://github.com/conda-forge/grpc-cpp-feedstock having to consume libabseil-static (on windows, due to C++ ABI issues) through CMake is also one.[^1]
[^1]: I accept that abseil is obviously special through its ABI-dependence on the C++ version used to compile[^2], and doesn't need to fit a general scheme, but it does happen to also fit the proposed "separation between static and shared outputs".
[^2]: Therefore, we cannot have more than one shared lib (e.g. C++17), but need several flavours of static libs (at least C++11/C++14) so that feedstocks can use a compatible ABI for the C++ version they're using to compile themselves.
It currently uses the static lib, but does that without being confused by shared-only CMake files, because those don't exist yet due to https://github.com/conda-forge/libprotobuf-feedstock/issues/68 (that issue needed upstream fixes first which have landed now, but adding the CMake integration in conda-forge for libprotobuf would run into the issue I'm describing here).
So, you are saying that it links statically right now without issue and as soon as shared only CMake files are added, it will link to shared library?
why is it controversial that projects might want to consume static libs through CMake?
Because you give no examples of doing so, but want to change the status quo. A change in status quo needs to be done when there's a real reason and you are not giving any.
Regarding examples, https://github.com/conda-forge/grpc-cpp-feedstock having to consume (https://github.com/conda-forge/grpc-cpp-feedstock/pull/196) libabseil-static (on windows, due to C++ ABI issues) through CMake is also one.[^1]
I don't have time to go through these. Please explain in detail how this works right now and how having shared-only CMake files is an issue.
But zooming out a bit - why is it controversial that projects might want to consume static libs through CMake? That's a pretty normal thing to be doing (granted, not in conda-forge due to CFEP-18, but the exceptions that exist shouldn't have to be barred from using CMake?)...
Because of CFEP-18. We want to encourage shared builds.
Since you are going into hypotheticals, let me do the same. In the case of onnxruntime, your logic there is faulty as well. What if onnxruntime picks up a dependency that depends on shared libprotobuf? Then onnxruntime cannot link against libprotobuf-static because of the conda package conflict that you are introducing artificially.
So, you are saying that it links statically right now without issue and as soon as shared only CMake files are added, it will link to shared library?
Yes, if we do it like you propose (no CMake targets for the static lib), I believe that would happen. That adding CMake targets is beneficial for other reasons should not be in question I presume? (https://github.com/conda-forge/libprotobuf-feedstock/issues/68 is almost 2 years old)
why is it controversial that projects might want to consume static libs through CMake?
Because you give no examples of doing so, but want to change the status quo. A change in status quo needs to be done when there's a real reason and you are not giving any.
I think working CMake integration is not "not giving any [reason]", see also below.
Because of CFEP-18. We want to encourage shared builds.
Yes, I get that, and those feedstocks that diverge don't do this for fun. Those are often tricky packages in the first place, and I don't understand what benefit the status quo has that justifies subtly breaking their ability to use native CMake integration.
Please explain in detail how this works right now and how having shared-only CMake files is an issue.
I put this in the footnotes above already, but in short: per abseil version, we cannot have more than one shared lib, but we need at least three different ABIs (per C++ standard version). Static builds make up the missing ones, and cannot be co-installed (due to different ABI). It's a special case as I said, but it literally cannot be handled (safely/sanely) through a static-depending-on-shared setup.
Since you are going into hypotheticals, let me do the same. In the case of onnxruntime, your logic there is faulty as well. What if onnxruntime picks up a dependency that depends on shared libprotobuf? Then onnxruntime cannot link against libprotobuf-static because of the conda package conflict that you are introducing artificially.
Yes, I noted this in the OP, and explained how this is likely not a problem in practice, and a work-around if it turns out to be. Quoted again for convenience:
The only downside would be that users of -static packages cannot depend on another package (say a) which has been built against a shared version of the same lib. Since most of the static libs live pretty close to the bottom of the stack, I don't think this would be much of a problem. And even so, there would be a solution: build a (perhaps also static?) version of a against libxyz-static.
I think (users of) static libs are special enough that we can inflict this (hypothetical!) pain on them
It's of course a fair question whether this is actually more painful than not having working CMake integration, and one where I'll gladly admit defeat if there are many affected feedstocks. But at least it would make the failure immediately visible, and presumably entice those feedstocks to try building against the shared lib (if their dependency can do it, why not they themselves as well).
Yes, if we do it like you propose (no CMake targets for the static lib), I believe that would happen.
You are not making sense to me at all. How would this happen?
You are not making sense to me at all. How would this happen?
Currently, the CMake invocation in onnxruntime falls back on other means to detect libprotobuf-static (pkgconfig or whatever), but if we were to add native CMake files ($PREFIX/cmake/protobuf-*.cmake, see equivalent on windows), it would prefer those. If those CMake files only refer to the shared build, the wrong library would be picked up AFAICT.
Currently, the CMake invocation in onnxruntime falls back on other means to detect libprotobuf-static (pkgconfig or whatever), but if we were to add native CMake files ($PREFIX/cmake/protobuf-*.cmake, see equivalent on windows), it would prefer those. If those CMake files only refer to the shared build, the wrong library would be picked up AFAICT.
No, that's not how it works. I took the time to research this (which you could have done too) and onnxruntime uses cmake to find protobuf. The native CMake files are what CMake calls CONFIG mode in find_package, and CMake already has built-in config files for supporting protobuf (MODULE mode). MODULE mode is the default and, irrespective of the presence of the files that you call native CMake files, it will find the static protobuf files because of the option -DProtobuf_USE_STATIC_LIBS=ON used by onnxruntime. To confirm this, I first built the static library and installed it into a prefix, and then built the shared library and installed it on top. onnxruntime was able to find the static library with no issues.
Any other examples where you "need" this?
Any other examples where you "need" this?
I don't know why this discussion needs to be so combative. I'm trying to solve (what I perceive to be) a genuine problem - we can disagree on how broken things are, or what the right solutions are, but can we turn down the intensity a bit?
I said I don't know the onnxruntime - I used it as an example I had found. For me the main point is in principle, for you it's specific examples that justify change. We can also disagree about that.
Still, I feel your investigation underscores my point - it took a non-trivial amount of effort by one of the most knowledgeable people in conda-forge to find out whether my (I'd say, at least) plausible scenario would become relevant (and what if CMake didn't have built-in detection like it has for protobuf, or if onnxruntime didn't use -DProtobuf_USE_STATIC_LIBS=ON, etc. ...?).
How is all this effort / complexity in understanding a recipe beneficial, when we could have a completely unambiguous setup where that mixing never even becomes a possibility? I understand that changing something is work and engenders certain risks, but I feel also the status quo should be less set in stone, especially when problems are identified.
Beyond that, I still don't know what a patch-free build setup would look like where libxyz-static depends on libxyz and only the shared libs get CMake targets.
No, it doesn't. You just need cmake {build, install}. No patches necessary.
Could you sketch how to do this based on - for example - the zstd feedstock?
I don't know why this discussion needs to be so combative. I'm trying to solve (what I perceive to be) a genuine problem - we can disagree on how broken things are, or what the right solutions are, but can we turn down the intensity a bit?
Sure. Sorry about that. I urge you to take a couple of hours before replying so as to give the impression that you have thought about things before replying.
How is all this effort / complexity in understanding a recipe beneficial, when we could have a completely unambiguous setup where that mixing never even becomes a possibility? I understand that changing something is work and engenders certain risks, but I feel also the status quo should be less set in stone, especially when problems are identified.
There are use-cases where having both static and shared libraries is needed. For example, if you want to link your program against the openblas static library (for performance, ABI, etc), but still want to use numpy in your stack (which comes with openblas shared), your suggestion basically makes that use-case impossible. On the other hand, your use-case can be worked around in the build.sh of any feedstock. So, it's a choice between,
1. Make it possible for users who need both shared and static, but make it inconvenient for people who only need static
2. Make it convenient for people who only need static, but make it impossible for users who need both shared and static
Option 2 is clearly bad.
Beyond that, I still don't know what a patch-free build setup would look like where libxyz-static depends on libxyz and only the shared libs get CMake targets.
You have the solution right there. Add libxyz as a host dep in libxyz-static. For eg: Add - libzstd at https://github.com/conda-forge/zstd-feedstock/blob/main/recipe/meta.yaml#L90
Thank you all for this discussion as I learned a lot from it!
As someone who ends up in combative conversations often because I am not good at this and I am too quick to respond, I just want to say I appreciate all your work, @h-vetinari, and all your feedback, @isuruf.
@isuruf can we enlist you for a review/view in the abseil/grpc migration?
You have the solution right there. Add libxyz as a host dep in libxyz-static. For eg: Add - libzstd at https://github.com/conda-forge/zstd-feedstock/blob/main/recipe/meta.yaml#L90
Needed to be a run-dep as well (to get the headers), but yes, this works (though it's unclear to me which lib gets found by CMake; but I don't care that much anymore, as this approach seems to solve the most issues at once).
Is the following an accurate summary of the discussion?
- No top-level package.
- Build shared first output.
- Build static second output.
- Static output depends on shared output in both run and host.
- Use tests to check that shared output excludes static libraries.
- Static CMake files probably clobber shared CMake files (depends on build config)?
Is the following an accurate summary of the discussion?
This matches my understanding. Thanks for the summary.
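For reference, the summarized layout could look roughly like the recipe sketch below. It uses the placeholder name libxyz from earlier in the thread; the library file name and the unix-only test commands are assumptions for illustration, and the required `package:` name/version section is omitted.

```yaml
# Sketch of the layout from the summary; "libxyz" is a placeholder.
# The top-level package ships no files of its own, only the outputs do.
outputs:
  # Built first: shared library only.
  - name: libxyz
    build:
      run_exports:
        - {{ pin_subpackage('libxyz') }}
    test:
      commands:
        # Check that the static library did not slip into the shared output
        # (unix path; the file name libxyz.a is an assumption).
        - test ! -f $PREFIX/lib/libxyz.a

  # Built second: static library, layered on top of the shared output.
  - name: libxyz-static
    requirements:
      host:
        - {{ pin_subpackage('libxyz', exact=True) }}
      run:
        # Run dependency is needed too, e.g. to get the headers.
        - {{ pin_subpackage('libxyz', exact=True) }}
    test:
      commands:
        - test -f $PREFIX/lib/libxyz.a
```

Whether the static output's CMake files overwrite those of the shared output still depends on the upstream build configuration, as noted in the summary.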