kfp-tekton
push_artifact does not push output artifacts to s3 in copy-artifacts step
/kind bug
What steps did you take and what happened:
When using data passing via `kfp.components.OutputPath()` / `kfp.components.InputPath()`, we notice that artifacts never make it into s3 storage; instead we see a 0 KB file named after the output file.
What did you expect to happen: Output artifacts show up in s3 storage.
Additional information:
When reproducing it please use a separate backing storage for pvc than the s3 solution for apiserver.
Take for example:
- Pipeline in kfp dsl
- Same Pipeline after it's fed through the kfp-tekton compiler
- Same Pipeline after it is adjusted by apiserver, post submit
Once the Pipeline in (1) makes it to the api-server (3), we see new task steps added to manage artifact passing/tracking. The final step added by the api-server is `copy-artifacts`; this step pushes the task's artifacts to s3 storage via the `push_artifact` script. The problem we are seeing is that when an artifact is >4 KB, this fails.
This step expects the artifact to be in `/tekton/home/tep-results`, but what you find there is just a 0 KB file named after the artifact output. This occurs because `copy-results-artifacts` does not copy the artifact to `/tekton/home`, since it's too big (>3072 bytes):
```shell
if [ -d /tekton/results ]; then mkdir -p /tekton/home/tep-results; mv /tekton/results/* /tekton/home/tep-results/ || true; fi
```
This seems to take `/tekton/results/` and send it to `/tekton/home`. In the preceding step, `copy-results-artifacts`, we see:

```shell
copy_artifact $(workspaces.produce-output.path)/artifacts/simple-pipeline-fe138/$(context.taskRun.name)/mydestfile $(results.mydestfile.path)
```

So we expect contents in `/workspace` to move to `/tekton/results`, so they can be moved to `/tekton/home` in the next step.
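The intended hand-off described above can be sketched as plain shell. This is an illustration only, not the generated scripts themselves; the function names are made up here, and the real directory paths from the issue appear only as test defaults:

```shell
# Hedged sketch of the intended three-step artifact hand-off. Directories are
# parameters so this can run anywhere; in the real taskrun they would be the
# /workspace, /tekton/results, and /tekton/home/tep-results paths quoted above.

# step 1: copy-results-artifacts copies the artifact from the workspace into results
copy_result() {                      # usage: copy_result <workspace-file> <results-dir>
  cp "$1" "$2/$(basename "$1")"
}

# step 2: the injected script moves everything in results into the shared home dir
move_results_home() {                # usage: move_results_home <results-dir> <home-results-dir>
  if [ -d "$1" ]; then
    mkdir -p "$2"
    mv "$1"/* "$2"/ || true
  fi
}

# step 3: copy-artifacts (push_artifact) then reads the file from <home-results-dir>
```

When the artifact is small, this chain works end to end; the bug described in this issue is that step 1 never copies the full file for large artifacts, so steps 2 and 3 only ever see an empty placeholder.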
But when the pipeline is fed through the compiler in (2) above, we see that the script added in `copy-results-artifacts` will only move contents of `/workspace` into the result if they are <3072 bytes. (This makes sense, since results have to stay under 4 KB to avoid the termination-message error, right?) And since this file is ~20 MB, that doesn't happen; instead we end up with an empty file, and this trickles into `push_artifact`.
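The size guard just described can be approximated like this. The 3072-byte threshold comes from the issue text above, but the function name and exact logic are assumptions for illustration, not the compiler's actual generated code:

```shell
# Hypothetical approximation of the size guard in the compiler-generated
# copy-results-artifacts step: small artifacts are copied into the result file,
# large ones leave only the 0-byte placeholder that push_artifact later uploads.
copy_artifact_guarded() {            # usage: copy_artifact_guarded <src> <dst>
  if [ "$(wc -c < "$1")" -le 3072 ]; then
    cp "$1" "$2"                     # fits within the termination-message budget
  else
    : > "$2"                         # too big: create an empty placeholder instead
  fi
}
```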
We noticed that simply fetching the `push_artifact` output artifact path arguments from the paths stored in `tekton.dev/artifact_items` seemed to work (example here). That could maybe be a trivial change; I'm not sure it accounts for everything, though.
As a workaround, we are looking to use a custom `push_artifact` script that will look for the artifact in the workspaces (if it exists) and push that path to s3.
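A minimal sketch of that workaround might look like the following. The function name, argument layout, and commented-out upload command are placeholders, not the script kfp-tekton actually generates; the real upload step is elided:

```shell
# Hedged sketch of a workaround push_artifact: prefer the full artifact found in
# the workspace over the (possibly 0-byte) copy under /tekton/home/tep-results.
push_artifact_ws() {   # usage: push_artifact_ws <name> <results-file> <workspace-file> <out-dir>
  name="$1"; result="$2"; ws="$3"; out="$4"
  if [ -s "$ws" ]; then
    src="$ws"                        # workspace copy exists and is non-empty
  else
    src="$result"                    # fall back to the tep-results copy
  fi
  tar -czf "$out/$name.tgz" -C "$(dirname "$src")" "$(basename "$src")"
  # mc cp "$out/$name.tgz" storage/...   # actual s3/minio upload elided
}
```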
Environment:
- SDK Version: 1.5.1
- Tekton Version (use `tkn version`): 0.47.x
- Kubernetes Version (use `kubectl version`): 1.25
We notice the same behavior when not using `.add_pod_annotation()` for pipelines that use data passing. Example.
> We notice the same behavior when not using `.add_pod_annotation()` for pipelines that use data passing
It's a bit nuanced. What I've seen is:
- if I have a 2-step pipeline ... step1 with an output, and step2 with an input and an output, then:
- if I leave off the `artifact_outputs` annotation on step2, step2's artifact gets uploaded to minio, but step2 goes to a failure state with the message `Error while handling results: Termination message is above max allowed size 4096`. Example.
- if I include the `artifact_outputs` annotation on step2, step2 uploads a 0-byte tgz archive to minio, and step2 shows success. Example.

I can't get both a success state and a successful upload at the same time.
@gregsheremeta this is because, in your example, you'll notice that the artifact output gets moved to `/tekton/results`. Since `push_artifact` is pushing everything in `/tekton/results` (via `/tekton/home` -> `/tekton/results` in `copy-results-artifacts`), the artifact will get pushed to s3 in this case. But because `/tekton/results` now contains a file >4 KB, we get the termination error message.
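The trade-off above comes down to a single size constraint: Tekton surfaces `/tekton/results` through the container termination message, which Kubernetes caps at 4096 bytes. A purely illustrative check (the helper function is made up, only the 4096 limit comes from the error message quoted above):

```shell
# Illustrative only: totals the files in a results directory and succeeds only
# if they would fit inside the 4096-byte termination-message cap that produces
# "Termination message is above max allowed size 4096" when exceeded.
results_fit_in_termination_message() {   # usage: results_fit_in_termination_message <results-dir>
  total=0
  for f in "$1"/*; do
    [ -f "$f" ] && total=$((total + $(wc -c < "$f")))
  done
  [ "$total" -le 4096 ]
}
```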