packit-service
packit-service copied to clipboard
Please respect `SourceN:` in spec files for upstream tarballs in release `copr_build`s
So far we've used packit mostly for validating upstream PRs. In these cases, one always needs to specify create-archive:, as there is no existing tarball to download from anywhere. So it usually says something like make dist
This is different for release builds, both for propose_upstream: and copr_build: These are triggered on releases, which usually produce an official tarball and attach it as a release artifact (or use GitHub's autogenerated ones). It is quite important that distributions use the exact upstream tarball, for validation purposes, and avoiding subtle breakage from rebuilding them.
However, naïvely using create-archive: at the top does exactly that: It rebuilds the release tarball from the local git checkout instead of downloading it from upstream. As an effect, the official cockpit-ostree-190.1.tar.xz is different from the tarball in the srpm of the packit copr_build.
Today I spent some time pestering @TomasTomecek and doing some experiments, and Tomas pointed out that it's possible to move the create-archive: action to the job: tests:
- job: tests
trigger: pull_request
metadata:
targets:
- fedora-35
- fedora-36
- fedora-development
- centos-stream-9-x86_64
actions:
# tarball for upstream PRs
post-upstream-clone: make cockpit-podman.spec
create-archive: make dist
That then leaves the default action for all the other (release based) builds, like propose_upstream and copr_build. But I tested that, and it seems the implied default is to run
Command: git archive --output cockpit-podman-47.3.tar.gz --prefix cockpit-podman-47.3/ HEAD
This is rather bad! For many projects it will not even work (as it's e.g. missing autoconf.sh or node_modules or any other steps that happen for the official release tarball), but the bigger problem is that for some projects it will work (kinda) -- but then the packages uploaded to Fedora don't use the official upstream release, but some random untested build.
IMHO it would make much more sense to respect the .spec file, and try and download its SourceN: tarballs. As propose-upstream/copr-build etc. are triggered on release, that's not even a race condition -- by the time the release is published, all the release artifacts should exist.
If that fails, I suggest to just fail the release -- it's better to be explicit about it than trying to cobble together some tarball. If you want to keep that, then can it be a fallback if Source0: does not work?
In https://github.com/cockpit-project/cockpit-podman/pull/978 I have a hack for this:
actions:
post-upstream-clone: make cockpit-podman.spec
# tarball for releases (copr_build, koji, etc.)
create-archive:
- sh -exc "curl -L -O https://github.com/cockpit-project/cockpit-podman/releases/download/${PACKIT_PROJECT_VERSION}/${PACKIT_PROJECT_NAME_VERSION}.tar.xz"
- sh -exc "ls ${PACKIT_PROJECT_NAME_VERSION}.tar.xz"
But this is dreadful -- it's hard to discover, hard to get right (like, missing curl's -L option or forgetting to wrap the thing in sh -ex), and duplicates information such as the download URL, which makes it harder to release from forks, for example.
I suggest to just fail the release -- it's better to be explicit about it than trying to cobble together some tarball.
By that I meant that adding create-archive: git archive ... to packit.yml explicitly is making it much more obvious what will happen/what the developer wants. But this is really bad practice, and if someone does that for Fedora (or any other distro) builds for projects which do have official tarballs (almost all) they need to get a talk. :grin:
It's Friday and my brain works only on 100 MHz. I believe Packit already does this because that's exactly what we do ourselves - perform official archive generation and upload to PyPI in a GHA and then pick it up via propose_downstream. Sorry I did not realize this while we spoke on chat.
https://github.com/packit/packit/blob/main/packit.spec#L11
Source0: %pypi_source
That should expand to a URL.
I also just verified that this is what Packit does in propose_downstream:
- Fetch archive specified in Source0
- Upload it to lookaside cache
https://github.com/packit/packit/blob/85cf200a4d2410237ed562ec306f97de7ed588b2/packit/api.py#L971-L973
If that fails, I suggest to just fail the release -- it's better to be explicit about it than trying to cobble together some tarball. If you want to keep that, then can it be a fallback if Source0: does not work?
We instead retry the release a few times.
Again, sorry Pitti I only realized this just now.
Therefore I presume the create-archive action in propose_downstream is no-op. Can someone from my team confirm?
For the cockpit projects we started slowly with copr_build -- so perhaps that does something else than propose_downstream? My cited cockpit-ostree example got a rebuilt tarball, but that still has create-archive: on the top level, instead of on the job: tests stanza, so it's possible that it behaves that way (it's a bit difficult to test..).
So if that is true, this issue would ask about using the same behaviour for copr_build.
What I don't understand: https://github.com/packit/packit/blob/main/.packit.yaml has create-archive: at the top-level as well. Why is that not being used for propose_downstream:?
It really looks like it would. Your official upstream tarball:
❱❱❱ curl -L https://github.com/packit/packit/archive/refs/tags/0.51.0.tar.gz | sha512sum
aa0c8bd4129656fa8244e701803fd527470e1a99d8fca9c6b031425fce42097b8e41fc0194b9b09569f20fcb6762994ac404c7ac83d9c18321a0917df378a463 -
However, dist-git has a different sha, namely 7e551fbf711f4835db..
So it very much looks to me that packit itself has the exact same problem that I'm trying to solve here?
λ curl -L https://files.pythonhosted.org/packages/63/76/87bda3ca8f4403e128b55c8a319e496175abc71a5f2a7f4b7008d29768de/packitos-0.51.0.tar.gz | sha512sum
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17.3M 100 17.3M 0 0 9186k 0 0:00:01 0:00:01 --:--:-- 9184k
7e551fbf711f4835db0c6276653bdfeb72fbcf3330378d8d2bbc688a4c9691b13c3ab36bcde408f4aab7d9684a441a35afb7ad830c941dfb28392547c215244d -
It is same as the archive at PyPI.
I see, the github tarballs are just the automatically generated ones. Indeed packit 0.51.0's srpm build does create a tarball:
2022-05-13 12:11:19.623 base_git.py INFO Using user-defined script for ActionName.create_archive: [['python3', 'setup.py', 'sdist', '--dist-dir', '.'], ['sh', '-c', 'echo packitos-$(python3 setup.py --version).tar.gz']]
2022-05-13 12:11:19.623 commands.py DEBUG Command: python3 setup.py sdist --dist-dir .
[...]
2022-05-13 12:11:20.080 logging.py INFO Creating tar archive
2022-05-13 12:11:21.001 logging.py INFO removing 'packitos-0.51.0' (and everything under it)
2022-05-13 12:11:21.030 commands.py DEBUG Command: sh -c echo packitos-$(python3 setup.py --version).tar.gz
So this seems to work because your tarballs happen to be reproducible? We have some efforts of making that happen, but from packit's POV this is pretty much accidental, and nothing enforces this.
Ahaaa, I missed that. Then your curl approach resolves the issue in the meantime.
In the longterm, is the ask here that propose_downstream does not utilize create-archive and instead fetches the SourceN archive?
Edit: now that I thought about this a bit more, we should also implement checking of the archives if create-archive will be available during propose_downstream. And verify that the upstream archive is matching the release pubkey.
is the ask here that propose_downstream does not utilize create-archive and instead fetches the SourceN
That's what should eventually happen, yes. I am less sure if it should just ignore the top-level create-archive:, or have a more sensible default than git archive -- my gut feeling is currently that a pull_request specific explicit create-archive: in project's packit.yaml, plus and an implied default of fetching SourceN is the right thing to do. Ignoring explicit configuration feels weird.
Thanks, Martin! I nudged Franta to look at this but he's still going through a backlog of info after coming back from vacation :)
We will for sure talk about this and start addressing it. I think we should come up with a comprehensive solution here and not do a quick & dirty fix. A lot of people actually ask us how to do upstream releases properly so this is a good opportunity to come up with best practices and make the releasing convenient with Packit and propose_downstream.
@TomasTomecek : Fully agreed, I appreciate! Especially as there's the curl workaround, so this isn't a blocker for anyone.
@martinpitt Hello, finally there..;)
First things first:
- The
propose-downstreamandcopr_buildbehaves differently. - Copr builds for all triggers behave from the implementation point mostly the same. (Just default project naming differences, ...) Not saying we can't change that.
For the propose-downstream the archive manipulation works as follows (for reference, the code is here:
https://github.com/packit/packit/blob/213c789c5ec496e4471dbdb7279d9d7d61b4ed44/packit/api.py#L937)
- Download the upstream archive.
- Send it to the lookaside-cache if not present.
For copr_build, there is a create_archive action with the git archive as default.
So. I see basically two approaches:
- Respect the
Sourcefield forcopr_buildon release and just download what's there. (Ideally, use some retries as we use forpropose-downstreamto resolve some delays on source generation.) - Use a different default for
copr_buildon release -- download theGitHubrelease archive but allow redefinition. (So one can still build the archive from scratch if wanted/needed.)
Just a note to Packit itself: Since there is no create_archive for propose_downstream, we can define that action on the top level without any issues: https://packit.dev/docs/actions/#propose-downstream-command
Respect the Source field for copr_build on release and just download what's there. (Ideally, use some retries as we use for propose-downstream to resolve some delays on source generation.)
- :heavy_plus_sign: pristine source used, no spec-file change, very transparent
- :heavy_minus_sign: can't be easily redefined (maybe some hook to do that manually if really needed?)
Use a different default for copr_build on release -- download the GitHub release archive but allow redefinition. (So one can still build the archive from scratch if wanted/needed.)
- :heavy_plus_sign: can be redefined
- :heavy_plus_sign: old behaviour can be preserved
- :heavy_minus_sign: spec-file is changed and uses a local file (can be solved)
@lachmanfrantisek : Thanks for the details! It feels to me that releasing to a COPR or to Fedora shouldn't make much difference wrt. tarballs, they are all just slightly different release targets?
If propose_downstream really does download the tarball from the release, then why does the packit 0.51.0 release srpm call python3 setup.py sdist through the create_archive action (defined in .packit.yml)? Is that just some misleading log, i.e. it doesn't actually use that tarball? I can't see any download from github in the log.
I have no strong opinions between your two alternatives. The first one feels more robust against accidental rebuilds (when project maintainers just set a top-level create-archive:), the second one seems cleaner. I guess I looked at this for too long now to give a good opinion, maybe some poll in #packit or twitter?
If
propose_downstreamreally does download the tarball from the release, then why does the packit 0.51.0 release srpm callpython3 setup.py sdistthrough thecreate_archiveaction (defined in .packit.yml)? Is that just some misleading log, i.e. it doesn't actually use that tarball? I can't see any download from github in the log.
The SRPM build log is coming from a copr_build job so it uses that action. (Until we resolve this issue...;)
It uses this job config: https://github.com/packit/packit/blob/213c789c5ec496e4471dbdb7279d9d7d61b4ed44/.packit.yaml#L83
and you can see the picked job in the logs:
2022-05-13 12:11:18.666 utils.py DEBUG job_config_index: 6
2022-05-13 12:11:18.666 utils.py DEBUG Final package (job) config: JobConfig(job=JobType.copr_build, trigger=JobConfigTriggerType.release, config_file_path='.packit.yaml', specfile_path='packit.spec', files_to_sync='[SyncFilesItem(src=['packit.spec'], dest=packit.spec, mkpath=False, delete=False, filters=[]), SyncFilesItem(src=['.packit.yaml'], dest=.packit.yaml, mkpath=False, delete=False, filters=[]), SyncFilesItem(src=['plans/'], dest=plans/, mkpath=False, delete=False, filters=[]), SyncFilesItem(src=['.fmf/'], dest=.fmf/, mkpath=False, delete=False, filters=[])]', dist_git_namespace='rpms', upstream_project_url='https://github.com/packit/packit', upstream_package_name='packitos', downstream_project_url='https://src.fedoraproject.org/rpms/packit.git', downstream_package_name='packit', dist_git_base_url='https://src.fedoraproject.org/', actions='{<ActionName.create_archive: 'create-archive'>: ['python3 setup.py sdist --dist-dir .', "sh -c 'echo packitos-$(python3 setup.py --version).tar.gz'"], <ActionName.get_current_version: 'get-current-version'>: ['python3 setup.py --version'], <ActionName.pre_sync: 'pre-sync'>: ['python3 plans/git_reference.py']}', upstream_ref='None', allowed_gpg_keys='['5DE3E0509C47EA3CF04A42D34AEE18F83AFDEB23']', create_pr='True', sync_changelog='False', create_sync_note='True', spec_source_id='Source0', upstream_tag_template='{version}', patch_generation_ignore_paths='[]',patch_generation_patch_id_digits='4',copy_upstream_release_description='True',sources='[]', merge_pr_in_ci=True, srpm_build_deps=['python3-pip', 'python3-setuptools_scm'], identifier='None', packit_instances=[<Deployment.prod: 'prod'>, <Deployment.stg: 'stg'>], issue_repository='https://github.com/packit/packit', release_suffix='None', targets={'fedora-all': {}, 'epel-8': {}}, timeout=7200, owner=None, project=packit-releases, dist_git_branches=set(), branch=None, scratch=False, list_on_homepage=True, preserve_project=True, additional_packages=[], additional_repos=[], fmf_url=None, fmf_ref=None, use_internal_tf=False, skip_build=False,env={},enable_net=True)
Ah, sorry! Indeed the rawhide propose_upstream log is this one, that looks much better. So retitling accordingly. I'm glad that this is much less of an issue than I feared, sorry for the noise!
Some work was done on this issue and @lachmanfrantisek promised to go through again and assess what needs to be done to resolve this.
I have a new proposal combining the two mentioned above (basically what Martin originally requested and a generic way of what Cockpit does currently):
For copr_build on release, we will use create_archive action if specified. If not, don't build anything and download what's present in Specfile. (Basically, changing the default for create_archive to download the SourceN.)
A few things to consider:
-
This will not change anything for users with the
create_archiveaction defined. -
For users not setting the
create_archiveaction, this will change the behaviour. (!) -
What about spec-file version? (A very related to packit/packit#1757)
- By now, we don't change the version for release Copr builds.
- No-bump works well for upstream-handled spec-files (=spec-file is prepared), bump is required for generated/downloaded spec-files (=spec-file is not prepared).
- Possible solutions:
- Introduce a
bump_spec_versionconfig option both forcopr-buildandpropose-downstreamthat will beTruefor PRs and commits andFalsefor releases.- If a user has the spec-file prepared,
bump_spec_version=Falseshould be used, if a user needs to update the spec-file,bump_spec_version=Trueshould be used. (This will allow using a dummy value upstream without any manual change required.) - We need to be careful, where we get the version from -- if
bump_spec_version==True, we can't load a version from spec-file (basically an issue packit/packit#1757) but useupstream_tag_templateto get it from the tag.
- If a user has the spec-file prepared,
- Introduce a
-
What about non-URL sources? (Related to packit/packit#1665)
- Require URL or trigger the build action we use currently?
-
What do we want to do if the source is not available?
- Do the same thing that we do for
propose-downstream=> retry. But how to retry when we run the action in Copr (what will be the default in the future)? Can we check the source existence before running the action?
- Do the same thing that we do for
-
To be able to replicate this behaviour, we can implement a
--no-build/--pristine-source/--download-sourceoption for CLI. (We can also start with this.) Note it's a bit different to using--no-bumpCLI option since it will still trigger the local build. @mfocko You were playing with the build CLI lately, does this make sense?
As you can see, we need to be super careful to not break more than we improve..;)
What about non-URL sources? What do we want to do if the source is not available?
These would break propose_downstream as well, right? This could break if some project uses packit to only release to COPR (but not Fedora), and does not release proper upstream tarballs either, right? The solution would be to define a create-archive: action for the copr_build/release job then?
Thanks @lachmanfrantisek for taking this on!
Since there is an hack, for now we don't prioritize this one.
We've come across this when regularly checking the least-recently-commented cards.
I haven't seen anyone else asking for this even though this might be really handy, @martinpitt do you have any issue with your workaround? If the ~~Copr~~Cockpit team does not have any issue with this, I will leave this open but we will prioritise other tasks until there is a stronger push.
When thinking about this, we might want to reuse sources dictionary and add a new option to mention we are relying on specifle. (E.g. preserve_specfile_url, or something like that.
@lachmanfrantisek no, the workaround has worked fine for the last 2 years. It's very non-obvious to come up with, but once you know it it's easy to maintain. So no urgency from our side. Thanks for checking!