Explain larger implications of using SCM feature with revisions
The SCM feature has surprised a number of people with some of its behavior. Users have made a number of feature requests relating to this behavior which seem reasonable. However, many of these have turned out to be in conflict with the advanced ideas and strong guarantees conceived in the original design of the feature, or outright impossible due to implementation details of Conan 1.0.
The most recent case, which has been requested a few times, is here:
https://github.com/conan-io/conan/issues/8214
Many users want to use SCM for its nice declarative syntax and the ability to run a recipe "out of source", but they want to filter out some files from affecting the Conan recipe revision, just like with exports_sources. This filtering is currently impossible and is in conflict with the strong guarantees SCM intends to make.
Nonetheless, this feature will keep getting requested unless the documentation explains clearly:
- That this is not currently allowed by design
- That the design is intended to provide guarantees about reproducibility, and how those are beneficial
- The resulting tradeoff that users then must accept when using SCM
- That even if we wanted to add the filtering feature, it's impossible in Conan 1.x to record and upload the SCM revision information with the package without generating a new Conan RREV, because we can only record it to a file, and thus it will always affect the manifest, and thus always affect the hash of conan_exports.tgz. We would need a new server API to support any other behavior.
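The last point can be illustrated with a simplified model. This is only a sketch, not Conan's actual implementation: it assumes the recipe revision behaves like a hash over a manifest of the exported files, and the names `recipe_revision`, `export_with_scm`, and `scm_data.txt` are hypothetical.

```python
import hashlib


def recipe_revision(exported_files):
    """Hash a manifest of {file name: content}; a stand-in for the RREV."""
    manifest = hashlib.sha256()
    for name in sorted(exported_files):
        file_hash = hashlib.sha256(exported_files[name].encode()).hexdigest()
        manifest.update(f"{name}: {file_hash}\n".encode())
    return manifest.hexdigest()


def export_with_scm(recipe_text, scm_revision):
    """Model an export in which the SCM revision is recorded in a file."""
    return {
        "conanfile.py": recipe_text,
        "scm_data.txt": scm_revision,  # hypothetical file name
    }


recipe = 'scm = {"type": "git", "revision": "auto"}'
rrev_abc = recipe_revision(export_with_scm(recipe, "ABC"))
rrev_xyz = recipe_revision(export_with_scm(recipe, "XYZ"))

# Same recipe text, different commit: the recorded revision changes the
# manifest, so the modeled RREV changes as well.
print(rrev_abc != rrev_xyz)  # prints: True
```

In this model there is no way to store the commit id alongside the recipe without it feeding into the hash, which is the constraint described above.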
I think we should definitely document something about the above situation and add it to the docs ASAP.
We may also want to add more. For example, another major unfortunate surprise was the discovery of an unexpected and detrimental implication of using SCM with Git workflows and package promotion between Conan repositories. Here's an example from earlier experimentation in our CI training course.
- feature branch is created -> SCM revision ABC -> Conan calculates RREV 001 and creates it -> upload to temp repo
- pull request -> SCM revision ABC -> Conan calculates and finds RREV 001 -> Use to rebuild downstream consumers
- merge PR -> SCM revision XYZ -> Conan calculates RREV 002, cannot find it, cannot promote it
In step 3 above, the logical and straightforward thing to do for most cases would be to promote the Conan package of RREV 001 to a "production" repository. It was already built and thoroughly tested with the sources which were changed, and then it was used to build and test all downstream consumers. There is no general/intrinsic need to rebuild the package and all its dependencies one more time based on the merge commit.
However, because the merge commit changed the Conan revision to RREV 002, the result of promoting RREV 001 would be really awkward and unfortunate. There would be no Conan RREV built which corresponds to the HEAD of the main branch. The entire dependency tree would need to be rebuilt based on the merge commit to create one. Both options are unacceptable for many organizations. Thus, the advice we should give to organizations who want to do package promotion with Git workflows is to use exports_sources instead of SCM.
So, we should clearly explain in the documentation that SCM is not the best fit for all use-cases, that it specifically leads to this situation with respect to package promotion, and that exports_sources is still a viable option which avoids this situation.
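The three steps above, and the exports_sources alternative, can be modeled in a few lines. This is a hedged sketch rather than Conan's real revision logic: the helper names are hypothetical, the "repository" is just a Python set, and it assumes the merge commit XYZ contains exactly the same file contents as commit ABC.

```python
import hashlib


def digest(*parts):
    """Stable hash standing in for a Conan recipe revision."""
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


def rrev_with_scm(recipe, commit_id):
    # Modeled on the behavior described above: the commit id is part of
    # what gets hashed.
    return digest(recipe, commit_id)


def rrev_with_exports_sources(recipe, sources):
    # Only the exported file contents are hashed; the commit id is not.
    return digest(recipe, *(f"{name}={sources[name]}" for name in sorted(sources)))


recipe = "class PkgConan(ConanFile): ..."
sources = {"src/lib.cpp": "int answer() { return 42; }"}

temp_repo = set()  # stands in for the temporary Conan repository

# Steps 1 and 2: feature branch and pull request, both at commit ABC.
temp_repo.add(rrev_with_scm(recipe, "ABC"))
temp_repo.add(rrev_with_exports_sources(recipe, sources))

# Step 3: merge commit XYZ, assumed to contain identical file contents.
scm_found = rrev_with_scm(recipe, "XYZ") in temp_repo
exports_found = rrev_with_exports_sources(recipe, sources) in temp_repo

print(scm_found, exports_found)  # prints: False True
```

Under these assumptions the SCM-based revision cannot be found after the merge, while the exports_sources-based revision is unchanged and could be promoted as-is.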
the first time I saw the SCM feature, I thought it was just syntactic sugar for:
def source(self):
    self.run("git clone https://github.com/myorg/myrepo.git")
with the SCM equivalent:
scm = {"type": "git", "url": "https://github.com/myorg/myrepo.git", "revision": "auto"}
then it turned out to be a much more complicated feature, not just a simple wrapper around self.run("git ...")
it actually looked more like an alternative to exports_sources, which is completely unreproducible, because it captures:
- local modifications
- staged changes
- ignored files
- uncommitted files
this makes exports_sources completely unreliable for multiple developers / machines.
on the other hand, SCM controls the integrity of sources and prevents local modifications, enforcing strict reproducibility.
this of course has its own costs: you're now tightly coupled to SCM revisions, and you have to rebuild whenever the revision changes (even for an empty commit, a merge commit, or a change to the README or Jenkinsfile).
it turns out this is just an abstraction cost: you either accept it, or you don't use SCM at all if it's unacceptable for you. it's like a cordless screwdriver, which needs its battery charged from time to time; if you can't accept that, you don't use a cordless screwdriver, you use a manual one.