Output artifact compressed even when `archive: none: {}`

Open casperakos opened this issue 2 years ago • 8 comments

Pre-requisites

  • [X] I have double-checked my configuration
  • [X] I can confirm the issue exists when I tested with :latest
  • [ ] I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

When I use `archive: none: {}`, the setting is not respected and the output is still compressed.

Version

3.3.4

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-disable-archive-
spec:
  entrypoint: artifact-disable-archive
  templates:
  - name: artifact-disable-archive
    steps:
    - - name: generate-artifact
        template: whalesay
    - - name: consume-artifact
        template: print-message
        arguments:
          artifacts:
          - name: etc
            from: "{{steps.generate-artifact.outputs.artifacts.etc}}"
          - name: hello-txt
            from: "{{steps.generate-artifact.outputs.artifacts.hello-txt}}"
          - name: hello-txt-nc
            from: "{{steps.generate-artifact.outputs.artifacts.hello-txt-nc}}"

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world | tee /tmp/hello_world.txt | tee /tmp/hello_world_nc.txt ; sleep 1"]
    outputs:
      artifacts:
      - name: etc
        path: /etc
        archive:
          none: {}
      - name: hello-txt
        path: /tmp/hello_world.txt
        archive:
          none: {}
      - name: hello-txt-nc
        path: /tmp/hello_world_nc.txt
        archive:
          tar:
            # no compression (also accepts the standard gzip 1 to 9 values)
            compressionLevel: 0

  - name: print-message
    inputs:
      artifacts:
      - name: etc
        path: /tmp/etc
      - name: hello-txt
        path: /tmp/hello.txt
      - name: hello-txt-nc
        path: /tmp/hello_nc.txt
    container:
      image: alpine:latest
      command: [sh, -c]
      args:
      - cat /tmp/hello.txt && cat /tmp/hello_nc.txt && cd /tmp/etc && find .

Logs from the workflow controller

time="2023-01-24T09:53:24.054Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2023-01-24T09:53:24.054Z" level=info msg="/etc -> /var/run/argo/outputs/artifacts/etc.tgz" argo=true
time="2023-01-24T09:53:24.055Z" level=info msg="Taring /etc"
time="2023-01-24T09:53:24.118Z" level=info msg="archived 1093 files/dirs in /etc"
time="2023-01-24T09:53:24.119Z" level=info msg="/tmp/hello_world.txt -> /var/run/argo/outputs/artifacts/tmp/hello_world.txt.tgz" argo=true
time="2023-01-24T09:53:24.119Z" level=info msg="Taring /tmp/hello_world.txt"
time="2023-01-24T09:53:24.120Z" level=info msg="/tmp/hello_world_nc.txt -> /var/run/argo/outputs/artifacts/tmp/hello_world_nc.txt.tgz" argo=true
time="2023-01-24T09:53:24.120Z" level=info msg="Taring /tmp/hello_world_nc.txt"

The /etc folder is also being archived when it should not be.

casperakos avatar Jan 24 '23 09:01 casperakos

@casperakos can you try it with v3.4.4?

sarabala1979 avatar Jan 30 '23 19:01 sarabala1979

Hi @sarabala1979 ,

Same with version 3.4.4:

time="2023-01-31T12:33:02.312Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2023-01-31T12:33:02.312Z" level=info msg="/etc -> /var/run/argo/outputs/artifacts/etc.tgz" argo=true
time="2023-01-31T12:33:02.313Z" level=info msg="Taring /etc"
time="2023-01-31T12:33:02.370Z" level=info msg="archived 1093 files/dirs in /etc"
time="2023-01-31T12:33:02.371Z" level=info msg="/tmp/hello_world.txt -> /var/run/argo/outputs/artifacts/tmp/hello_world.txt.tgz" argo=true
time="2023-01-31T12:33:02.371Z" level=info msg="Taring /tmp/hello_world.txt"
time="2023-01-31T12:33:02.371Z" level=info msg="/tmp/hello_world_nc.txt -> /var/run/argo/outputs/artifacts/tmp/hello_world_nc.txt.tgz" argo=true
time="2023-01-31T12:33:02.371Z" level=info msg="Taring /tmp/hello_world_nc.txt"

casperakos avatar Jan 31 '23 12:01 casperakos

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

stale[bot] avatar Mar 25 '23 11:03 stale[bot]

FYI, I just ran some work and I'm still seeing this kind of thing in the logs in v3.4.8. As noted above, the wait container logs suggest that an artifact gets tarred and a .tgz is moved around, but the final output blob stored in the archive looks fine at a quick glance.

brews avatar Jun 08 '23 19:06 brews

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Sep 17 '23 11:09 stale[bot]

it looks like there are 2 different paths of archiving ->

path 1 (has strategy support, like none):

  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/cmd/argoexec/commands/wait.go#L51
  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/workflow/executor/executor.go#L291
  • should see "Saving output artifacts" in logs if went down this path
  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/workflow/executor/executor.go#L298
  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/workflow/executor/executor.go#L313
  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/workflow/executor/executor.go#L410

path 2 (no strategy support, always does tarball):

  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/cmd/argoexec/commands/emissary.go#L201
  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/cmd/argoexec/commands/emissary.go#L271
  • should see "->" in logs if went down this path
  • https://github.com/argoproj/argo-workflows/blob/v3.4.8/cmd/argoexec/commands/emissary.go#L281
  • https://github.com/argoproj/argo-workflows/blob/f73e7f9f6f70beafbbd0fbb49870336698f612af/util/archive/archive.go#L27

since your log has "->" but not "Saving output artifacts" it means path 2 is being used.

I wonder if both the main and wait containers are uploading the same file as an artifact, so the same files go through twice 🤔 @terrytangyuan ?

tooptoop4 avatar Nov 14 '24 19:11 tooptoop4

With the emissary executor, the output path should be on an emptyDir volume so that output artifacts are not tarred.

https://argo-workflows.readthedocs.io/en/release-3.5/empty-dir/
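
A minimal sketch of that suggestion, modeled on the pattern shown in the linked empty-dir page and adapted to the reproduction workflow above. The volume name `out`, the mount path `/mnt/out`, and the `generateName` are illustrative assumptions, and whether this actually avoids the tarring step reported in this issue has not been verified here:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-emptydir-   # hypothetical name for this sketch
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      # write the output onto the mounted volume instead of the image's base layer
      args: ["cowsay hello world | tee /mnt/out/hello_world.txt"]
      volumeMounts:
      - name: out            # assumed volume name
        mountPath: /mnt/out  # assumed mount path
    volumes:
    - name: out
      emptyDir: {}
    outputs:
      artifacts:
      - name: hello-txt
        path: /mnt/out/hello_world.txt
        archive:
          none: {}
```

If the wait container logs still show a "->" line ending in `.tgz` for this artifact, the tarball path described earlier in the thread is presumably still being taken.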

lonsdale8734 avatar Dec 01 '24 09:12 lonsdale8734

Hey, I'm experiencing the same issue: setting compressionLevel or changing to archive=none is not being respected. I'm using argo-workflows v3.5.13.

edenavital avatar Jun 05 '25 14:06 edenavital