[bug] (v4) Unable to upload to same artifact name from multiple jobs

Open DanTup opened this issue 1 year ago • 39 comments

What happened?

The PR from dependabot to upgrade to v4 is failing on my project with this error:

Error: Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run

It seems like this is a breaking change that wasn't mentioned in the changelog and I'm not sure if it was deliberate.

There's some discussion about this behaviour in https://github.com/actions/upload-artifact/issues/279 and it suggests that it was fine to do this and there wouldn't be issues as long as the filenames within the artifact are unique. This was convenient to bundle the logs from several shards together into a single artifact rather than having lots of individual zip files to download.

What did you expect to happen?

I expected everything to work the same as in v3 unless it was noted as a deliberate breaking change.

How can we reproduce it?

Create multiple jobs that upload artifacts with the same name (but the files from each job are uniquely named).
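A minimal sketch of such a workflow, for illustration only. The workflow name, job name, shard values, and log paths are all hypothetical and not taken from the original project. With v4, whichever matrix job uploads second would be expected to fail with the 409 Conflict error shown above.

```yaml
# Hypothetical reproduction: names and paths are illustrative only.
name: repro
on: push
jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2]
    runs-on: ubuntu-latest
    steps:
      # Each shard writes a uniquely named file.
      - run: |
          mkdir -p logs
          echo "shard ${{ matrix.shard }}" > logs/shard-${{ matrix.shard }}.log
      # Every matrix job uses the same artifact name.
      - uses: actions/upload-artifact@v4
        with:
          name: logs
          path: logs
```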

Anything else we need to know?

No response

What version of the action are you using?

v4.0.0

What are your runner environments?

linux, windows, macos

Are you on GitHub Enterprise Server? If so, what version?

No response

DanTup avatar Dec 18 '23 10:12 DanTup

Actually, it seems this is called out here:

https://github.com/actions/upload-artifact?tab=readme-ov-file#v4---whats-new:~:text=The%20contents%20of%20an%20Artifact%20are%20uploaded%20together%20into%20an%20immutable%20archive.%20They%20cannot%20be%20altered%20by%20subsequent%20jobs.%20Both%20of%20these%20factors%20help%20reduce%20the%20possibility%20of%20accidentally%20corrupting%20Artifact%20files.

The contents of an Artifact are uploaded together into an immutable archive. They cannot be altered by subsequent jobs. Both of these factors help reduce the possibility of accidentally corrupting Artifact files.

It just wasn't included in the "What's changed" section of the Dependabot release notes because it just has a summary saying "Lots has changed". I should've followed the link through.

Seems like this is certainly intended though.

DanTup avatar Dec 18 '23 10:12 DanTup

Well, this is bad news for me. I find it convenient to use upload-artifact to write different files to the same folder in a build matrix, for example to compile custom C extensions for several combinations of Python versions and operating systems and publish them to a single folder.

like in https://github.com/Neoteroi/BlackSheep/actions/runs/7370452109/job/20056867940

Now if I want to upgrade my workflow, I need to publish to different folders and download artifacts from multiple sources, which makes the workflow look like a mess compared to how clean it used to look. For now I'm staying with v3, and I hope this will be reconsidered in a future version of these actions.

RobertoPrevato avatar Dec 31 '23 13:12 RobertoPrevato

Yeah, I rolled back to v3 too. Until I'm forced to upgrade, the old way is much more convenient for me.

DanTup avatar Dec 31 '23 15:12 DanTup

I had to roll back also

seanvaleo avatar Jan 05 '24 18:01 seanvaleo

Same here. I also use a matrix to build multi-platform releases into the same directory and then zip them all together. Rolling back to v3 :(

ptr727 avatar Jan 06 '24 00:01 ptr727

Seems like a lot of people have been bitten by this, so although it appears to have been deliberate I'm re-opening for better visibility to see if the authors want to chime in (of course, it's very possible it may just be closed as WAI).

DanTup avatar Jan 06 '24 09:01 DanTup

The behaviour of several jobs saving different files, with different names, into the same directory, to be downloaded as a single archive once all the jobs succeeded, was very desirable and didn't require the use of actions/download-artifact.

https://github.com/psycopg/psycopg/blob/fe097e2e4356a4332a54ae21e1c4307bc7c19b4f/.github/workflows/packages-src.yml

Moving to using v4 seems a major change which, for the moment, we will avoid.

dvarrazzo avatar Jan 07 '24 12:01 dvarrazzo

There were only 27 commits according to the v4.0.0 release notes.

Breaking changes (and this definitely is a breaking change) should absolutely be called out in major version bump releases' release notes.

It's explicitly included in the readme: https://github.com/actions/upload-artifact?tab=readme-ov-file#breaking-changes

That same text should absolutely be included in the release notes.

Conveniently GitHub lets you rewrite release notes at any time, so this can and should be fixed.

@robherley you wrote "Blog post coming soon!" in https://github.com/actions/upload-artifact/pull/466#issue-2040382024, I presume that's: https://github.blog/changelog/2023-12-14-github-actions-artifacts-v4-is-now-generally-available/

But it'd be really good if you had added a comment in the PR itself instead of forcing people to Google for it. (You could also include a link to the blog post in the release notes.)

Beyond that, most of its content, which I will excerpt below, should be in the release notes, and probably in the readme. Note that the readme content does not match the blog post.

Blog post first:

  • Artifacts will be scoped to a job rather than a workflow. This allows the artifact to become immediately available to download from the API after being uploaded, which was not possible before.
  • Artifacts v4 is not cross-compatible with previous versions. For example, an artifact uploaded using v3 cannot be used with actions/download-artifact@v4.
  • Using upload-artifact@v4 ensures artifacts are immutable, improving performance and protecting objects from corruption, which would often happen with concurrent uploads. Artifacts should be uploaded separately and then downloaded into a single directory using the two new inputs, pattern and merge-multiple, available in download-artifact@v4. These objects can then be re-uploaded as a single artifact.
  • A single job can upload a maximum of 10 artifacts.

Readme:

  1. On self hosted runners, additional firewall rules may be required.
  2. Uploading to the same named Artifact multiple times. Due to how Artifacts are created in this new version, it is no longer possible to upload to the same named Artifact multiple times. You must either split the uploads into multiple Artifacts with different names, or only upload once. Otherwise you will encounter an error.
  3. Limit of Artifacts for an individual job. Each job in a workflow run now has a limit of 10 artifacts.
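For reference, the pattern described in the blog post's third bullet (upload separately, download into one directory, re-upload as a single artifact) might look roughly like the sketch below. The job names, artifact names, and the `out` path are hypothetical; only the `pattern` and `merge-multiple` inputs come from the text above.

```yaml
# Sketch only: job names, artifact names, and paths are hypothetical.
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      # ... build steps that write files into ./out ...
      - uses: actions/upload-artifact@v4
        with:
          name: out-${{ matrix.os }}   # unique name per matrix job
          path: out

  combine:
    needs: build
    runs-on: ubuntu-latest
    steps:
      # Download every artifact matching out-* into one directory.
      - uses: actions/download-artifact@v4
        with:
          pattern: out-*
          merge-multiple: true
          path: out
      # Re-upload the merged directory as a single artifact.
      # The name deliberately does not match the out-* pattern.
      - uses: actions/upload-artifact@v4
        with:
          name: combined
          path: out
```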

jsoref avatar Jan 07 '24 17:01 jsoref

My data point against v4: generating documentation. I have a matrix of jobs (for different versions of build environments), and each one generates the documentation. I don't want to specify "only generate the docs on this particular version", as the matrix of versions changes frequently. I don't care which job overwrites the docs generated by which other job, as the docs are mostly the same: just give me any one of them.

The "correct" way of doing things is to name each artifact after the job matrix. But the job matrix is specified by Docker images, e.g. "username/repo:version", which is a bad filename. I really don't want to write a script just to compute a valid filename for the artifact.
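One possible way to get a unique, valid name without a script is to use the matrix job's index rather than the image name. This is only a sketch: it assumes the `strategy.job-index` context value is acceptable as a suffix, and the job name, image list, and `docs` path are hypothetical.

```yaml
# Sketch only: the matrix values and the docs path are hypothetical.
jobs:
  docs:
    strategy:
      matrix:
        image: ["username/repo:1.0", "username/repo:2.0"]
    runs-on: ubuntu-latest
    container: ${{ matrix.image }}
    steps:
      # ... steps that generate ./docs ...
      - uses: actions/upload-artifact@v4
        with:
          # The job index (0, 1, ...) avoids the "/" and ":" in the image name.
          name: docs-${{ strategy.job-index }}
          path: docs
```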

liyishuai avatar Jan 08 '24 10:01 liyishuai

Same here, just rolled back to v3 😞

GyulyVGC avatar Jan 09 '24 12:01 GyulyVGC

Just want to chip in that this caused a lot of issues for our company as well.

jontingvold avatar Jan 10 '24 10:01 jontingvold

👋 @RobertoPrevato For your example, you don't have to make too many changes, e.g.

https://github.com/Neoteroi/BlackSheep/blob/b283414c88e2d32675a1ca982d937a5dab75b532/.github/workflows/main.yml#L209-L212

Change that line to:

      - uses: actions/upload-artifact@v4
        with:
          name: dist-${{ matrix.os }}-${{ matrix.python-version }}
          path: dist

Then, in your publish job:

https://github.com/Neoteroi/BlackSheep/blob/b283414c88e2d32675a1ca982d937a5dab75b532/.github/workflows/main.yml#L226-L230

You can have it download all the artifacts matching a pattern to the same directory:

      - name: Download a distribution artifact
        uses: actions/download-artifact@v4
        with:
          pattern: dist-*
          merge-multiple: true
          path: dist

This case is outlined in the migration document: https://github.com/actions/download-artifact/blob/main/docs/MIGRATION.md

I'm happy to help any others with their workflow scenarios, thanks all for the feedback!

robherley avatar Jan 10 '24 19:01 robherley

@robherley Thank you! I appreciate your help very much. I'll try that as soon as I get the time.

RobertoPrevato avatar Jan 10 '24 19:01 RobertoPrevato

@robherley

Uploads and downloads must use the same actions versions.

It isn't obvious that you mean "the same major action version." -- If that's what's intended.

jsoref avatar Jan 12 '24 20:01 jsoref

@robherley saved me from having to downgrade to v3! Thanks!

gandarez avatar Jan 12 '24 23:01 gandarez

I rolled back to v3, too.

valeriosalvucci avatar Jan 31 '24 09:01 valeriosalvucci

The solution provided in the comment works just fine for v4 actions:

- name: Upload artifacts
  uses: actions/upload-artifact@v4
  with:
    name: dist-${{ matrix.os }}
    path: dist

- name: Download artifacts
  uses: actions/download-artifact@v4
  with:
    pattern: dist-*
    merge-multiple: true
    path: dist

air3ijai avatar Jan 31 '24 09:01 air3ijai

I rolled back to v3, too. Does anyone know of an alternative repo for this, maybe a fork based on v3? (Update) I've started thinking about making my own v4 fork and removing that "feature". I'm now checking how hard it would be to keep my fork synchronised with this repo.

donfirst avatar Jan 31 '24 15:01 donfirst

@robherley in https://github.com/actions/upload-artifact/issues/478#issuecomment-1885470013 you pointed to: https://github.com/actions/download-artifact/blob/main/docs/MIGRATION.md

I believe I've made a faithful reproduction of your workflow (v3 and v4), and it is not a drop-in replacement. https://github.com/check-spelling-sandbox/artifact-merge-hell/actions/runs/7729203589/job/21071843037

-Run ls -R my-v3-artifact
-my-v3-artifact:
-my-v3-artifact
-
-my-v3-artifact/my-v3-artifact:
+Run ls -R my-v4-artifact
+my-v4-artifact:
 file-macos-latest.txt
 file-ubuntu-latest.txt
 file-windows-latest.txt

Since paths are how people find files, the fact that the paths do not match would break anyone trying to use it.

I'm not sure that's precisely why I gave up, but I can assure you it is one of the problems I encountered.

jsoref avatar Jan 31 '24 16:01 jsoref

👋 @jsoref apologies, I had a typo in the migration docs.

The reason you have an extra my-v3-artifact/my-v3-artifact in the v3 download is the behavior outlined in the v3 docs:

If the name input parameter is not provided, all artifacts will be downloaded. To differentiate between downloaded artifacts, a directory denoted by the artifacts name will be created for each individual artifact.

These lines should be:

with:
  name: my-artifact
  path: my-artifact

In v4, this behavior is now toggleable with the merge-multiple parameter.

I'll update the migration docs to include the name parameter.
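As a rough illustration of that toggle (a sketch only; the artifact names and the `all` and `merged` paths are hypothetical):

```yaml
# Sketch only: artifact names and paths are hypothetical.
steps:
  # Without merge-multiple (it defaults to false), each matched artifact
  # is extracted into its own subdirectory under the given path,
  # e.g. all/my-artifact-a/ and all/my-artifact-b/.
  - uses: actions/download-artifact@v4
    with:
      pattern: my-artifact-*
      path: all

  # With merge-multiple: true, the files of every matched artifact
  # are placed directly in the given path.
  - uses: actions/download-artifact@v4
    with:
      pattern: my-artifact-*
      merge-multiple: true
      path: merged
```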

robherley avatar Jan 31 '24 17:01 robherley

Thanks, with that change, the results now do look compatible: https://github.com/check-spelling-sandbox/artifact-merge-hell/actions/runs/7730117258

jsoref avatar Jan 31 '24 17:01 jsoref

It looks like https://github.com/actions/upload-artifact/tree/main/merge should be the solution for this problem. PR: https://github.com/actions/upload-artifact/pull/505

ktrzcinx avatar Feb 02 '24 11:02 ktrzcinx

Take a look at https://github.com/actions/upload-artifact/blob/main/docs/MIGRATION.md#overwriting-an-artifact

      - name: New override option
        uses: actions/upload-artifact@v4
        with:
          name: build-artifact
          path: ./example
          overwrite: true

MatthewPattell avatar Feb 02 '24 16:02 MatthewPattell

The solution provided in the comment works just fine for v4 actions:

- name: Upload artifacts
  uses: actions/upload-artifact@v4
  with:
    name: dist-${{ matrix.os }}
    path: dist

- name: Download artifacts
  uses: actions/download-artifact@v4
  with:
    pattern: dist-*
    merge-multiple: true
    path: dist

It works like a charm, thank you!

panpuchkov avatar Feb 08 '24 18:02 panpuchkov

Here is an example of how it could be solved for BigBlueButton: https://github.com/bigbluebutton/bigbluebutton/pull/19777/commits/0f726d53d6e2795e31ff1500b2e02705ed3d4f8f.

stweil avatar Mar 23 '24 17:03 stweil

Rolling back to v3 as well :( actions/upload-artifact@v4 broke the entire pipeline, and none of the mentioned solutions solved our problem. In our workflow, we run parallel jobs in a matrix strategy that generate loads of files, which were uploaded into a single results folder in v3. With v4, each parallel job uploads a large number of folders with unique names, and one more job uses actions/upload-artifact/merge@v4 to merge these folders into a single one that is downloadable through the UI. The solution technically works but is unsuitable for us, because all the artifact folders from the parallel jobs also show up in the UI, which we don't need, and it makes a big mess of our artifacts.
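A possible mitigation, sketched here on the assumption that the merge sub-action's `delete-merged` input behaves as its README describes (removing the per-job artifacts once they have been merged). The job names and the `results-*` naming scheme are hypothetical, and whether this fully fits the workflow described above has not been verified.

```yaml
# Sketch only: job names and artifact names are hypothetical.
jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2, 3]
    runs-on: ubuntu-latest
    steps:
      # ... steps that write files into ./results ...
      - uses: actions/upload-artifact@v4
        with:
          name: results-${{ matrix.shard }}
          path: results

  merge:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/upload-artifact/merge@v4
        with:
          name: results            # name of the single merged artifact
          pattern: results-*       # per-job artifacts to merge
          delete-merged: true      # remove the per-job artifacts afterwards
```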

kryshenp avatar Apr 02 '24 12:04 kryshenp

Also had to revert to v3 here https://github.com/microsoft/cppwinrt/pull/1409

kennykerr avatar Apr 05 '24 14:04 kennykerr

@kryshenp, @kennykerr, it's not necessary (and also not a good idea) to revert to v3, which will stop working in the future. See my comment above for how to fix your workflow so that it works with v4.

stweil avatar Apr 05 '24 15:04 stweil

Hi,

At first I was also surprised by this breakage. Yes, it is unexpected and it creates migration effort, and I also reverted immediately. But after thinking about it, I realized why it was done and migrated our usage.

The previous implementation was probably a mistake; the new implementation is more consistent.

The advantages of the new stateless implementation:

  1. Artifacts are available immediately after creation, no need to wait until the workflow completes.
  2. The implementation is much faster because artifacts are not merged when that is not needed, and it takes less space to manage the process.
  3. Consistent approach of artifacts within workflow or reusable workflow.
  4. Consistent approach when running/rerunning partial workflow.

I hope the above helps explain the WHY. It is easier to migrate when we understand the WHY.

Thanks, Alon

alonbl avatar Apr 05 '24 15:04 alonbl

It was just announced that v3 will be disabled on 2024-11-30.

https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/

It is disappointing that the disabling of v3 was announced before v4 reached feature completeness.

jennydaman avatar Apr 16 '24 18:04 jennydaman