improve error message of `oras cp`

Open qweeah opened this issue 1 year ago • 5 comments

What happened in your environment?

Currently, the error output of oras cp is confusing:

oras cp --to-distribution-spec v1.1-referrers-tag $source_repo $target_repo
...
Error response from registry: sha256:0f58dd64e1e2b1f2c96e2b2cab30402540724a399061b035051534808114144a: not found	

What did you expect to happen?

The returned error should clarify

  1. Where the error occurs (which target: source or destination).
  2. What operation causes the error.

Depends on oras-project/oras-go#677

How can we reproduce it?

Run oras cp with error returned during copying.

What is the version of your ORAS CLI?

v1.2.0

What is your OS environment?

Ubuntu 20.04

Are you willing to submit PRs to fix it?

  • [ ] Yes, I am willing to fix it.

qweeah avatar Aug 06 '24 06:08 qweeah

Starting from v2.6.0, the underlying oras-go library returns structured errors for copy operations.

For example, a "not found" error during the "Resolve" operation on the source target is represented as:

{
    "Op": "Resolve",
    "Origin": "source",
    "Err": "not found"
}

Similarly, an "unauthorized" error during the "Push" operation on the destination target is represented as:

{
    "Op": "Push",
    "Origin": "destination",
    "Err": "unauthorized"
}

Based on these structured errors, we can build a more user-friendly error message.

Previously, the CLI prepended errors with the text "Error response from registry" to indicate that the error came from the registry rather than the ORAS client.

A new error message format can be considered as follows:

Error: operation "<Op>" failed on <Origin> <target_type> "<target_path>" (reference: "<target_reference>"): <Err>

Where <target_type> can be OCI layout, remote repository, or directory. <target_path> can be a path to a repository or a directory. <target_reference> can be either a tag or a digest.

This template clearly indicates which operation failed, where it occurred, and the reason for the failure.

Here are some examples:

Examples

Copy error

Before

$ oras cp docker.io/library/hello-world:v1 localhost:5000/test:v1
Error response from registry: failed to perform "FetchReference" on source: docker.io/library/hello-world:v1: not found

After

$ oras cp docker.io/library/hello-world:v1 localhost:5000/test:v1
Error: operation "FetchReference" failed on source remote repository "docker.io/library/hello-world" (reference: "v1"): docker.io/library/hello-world:v1: not found

Pull error

Before

$ oras pull --oci-layout test:v1
Error: failed to perform "Fetch" on source: sha256:f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2: application/vnd.oci.image.layer.v1.tar: not found

After

$ oras pull --oci-layout test_broken:v1
Error: operation "Fetch" failed on source OCI layout "test_broken" (reference: "v1"): sha256:f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2: application/vnd.oci.image.layer.v1.tar: not found

Push error

Before

$ oras push localhost:9999/test:v1
Error: failed to perform "Exists" on destination: Head "http://localhost:9999/v2/test/manifests/sha256:8d0f5635872556d472bbc0353a77f407d1f7a5ace6aab91b238da0f13d4826fb": dial tcp 127.0.0.1:9999: connect: connection refused

After

$ oras push localhost:9999/test:v1
Error: operation "Exists" failed on destination remote repository "localhost:9999/test" (reference: "v1"): Head "http://localhost:9999/v2/test/manifests/sha256:ccb984cadd844dbe41f9f0d2a5a7f935c451053386d076e3cdd07faba7cba084": dial tcp 127.0.0.1:9999: connect: connection refused

Attach error

Before

$ oras attach docker.io/library/hello-world:latest test --artifact-type "test/type"
Error response from registry: failed to perform "Push" on destination: unauthorized: authentication required: [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository]]

After

$ oras attach docker.io/library/hello-world:latest test --artifact-type "test/type"
Error: operation "Push" failed on destination remote repository "docker.io/library/hello-world" (reference: "latest"): unauthorized: authentication required: [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository]]

Wwwsylvia avatar Jun 18 '25 12:06 Wwwsylvia

Thanks @Wwwsylvia for proposing the error message improvement.

From a user's point of view, I think it would be good to make the error message friendlier and more actionable by following this guidance: https://github.com/oras-project/oras/blob/main/docs/proposals/error-handling-guideline.md.

For the first example, the error message is too verbose and technical: "operation 'FetchReference'" and "source remote repository" are not terms most users will immediately understand. The image name and tag are repeated. In addition, it doesn't help the user figure out why the image was not found or what to do next.

How about:

$ oras cp docker.io/library/hello-world:v1 localhost:5000/test:v1
Error: Could not find image "hello-world:v1" on the source registry.
Please check if the tag "v1" exists in the repository "library/hello-world" and you have access to the repository.

For the second example, "operation 'Exists'" and the full HEAD URL are internal implementation details. A long nested error without a clear suggestion does not seem helpful to users.

How about:

$ oras push localhost:9999/test:v1
Error: Could not connect to target registry at localhost:9999.
Connection was refused. Is the registry available?

For the third example, map[Action:pull ...] is an internal Go struct dump and is somewhat confusing to end users. It doesn’t tell users how to mitigate the error.

How about:

$ oras attach docker.io/library/hello-world:latest test --artifact-type "test/type"
Error: Unauthorized to attach artifacts to an image in the registry "docker.io/library/hello-world:latest".
You need the right access permissions on the registry to attach to an image.
Please log in using `oras login` and make sure your account has write access to the repository.

FeynmanZhou avatar Jun 19 '25 08:06 FeynmanZhou

@FeynmanZhou Thanks for the suggestions. I understand that actionable error messages would be much more user-friendly.

However, there is a challenge: the copy function operates like a black box to the CLI, and the error types it returns are varied and dynamic. From the CLI's perspective, it is difficult to determine the exact error type and provide a specific actionable message.

For example, for the command oras cp src.registry.com/hello:latest dst.registry.com/hello:latest, the copy function can return a "not found" error in the Err field in the following scenarios:

  • The source tag latest is not found (Op = "Resolve", Err = "latest not found").
  • The source tag is resolved to a digest, but the digest is not found (Op = "Fetch", Err = "sha256:abc123 not found").
  • One of the layers in the source artifact is not found (Op = "Fetch", Err = "sha256:def456 not found").

Handling all possible error types and providing specific actionable messages for each case requires significant effort.

I think we can start with a generic error message and improve it with more friendly customization and action recommendations for the most common cases.

Wwwsylvia avatar Jun 19 '25 11:06 Wwwsylvia

I think we can start with a generic error message and improve it with more friendly customization and action recommendations for the most common cases.

@Wwwsylvia Sounds good to me. I also agree that "the template should clearly indicate which operation failed, where it occurred, and the reason for the failure". Can we agree on the following guiding principles? Are these feasible?

  • Avoid overly verbose and technical terms in the error message, such as "operation 'FetchReference'". Operation-level information can be part of the debug logs for developers. If necessary, we could prompt users in the error message to use -d to get debug logs
  • Do not output internal Go struct dumps such as map[Action:pull ...]. Same as above
  • Keep the error message short and neat. Avoid repeated information
  • If the error type can be identified in a few common cases, provide actionable suggestions as much as we can

FeynmanZhou avatar Jun 20 '25 06:06 FeynmanZhou

Operation-level information can be a part of the debug logs for developers. If necessary, we could prompt users in the error message to use -d to get debug logs

IMO, including operation information in the error message is not harmful. If we only include it in the debug logs, users would need to rerun the failed command with --debug to get the information.

Do not output the internal Go struct dumps map[Action:pull ...].

The detailed error is part of the underlying Err field returned by oras-go. In this case, the error unauthorized: authentication required: [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository]] originates from the registry response.

Keep the error message short and neat. Avoid repeated information

Do you have any suggestions for improving the error message template? Error: operation "<Op>" failed on <Origin> <target_type> "<target_path>" (reference: "<target_reference>"): <Err>

Wwwsylvia avatar Jun 20 '25 13:06 Wwwsylvia

Per our discussion, here is what we aligned on for the error message structure:

{Error}: {Describe what happened}
{Error response from registry}: {Error description (HTTP status code can be printed out if any)}
{Inner Error}: {Describe what happened behind the scenes to help troubleshoot the root cause. This error message is generated by the underlying library, such as oras-go.}
[Usage: {Command usage}]: {If wrong command or flag used}
[{Recommended solution}]: {Optional. If the error type can be identified in a few common cases, provide actionable suggestions as much as we can.}

For example:

$ oras cp docker.io/library/hello-world:v1 localhost:5000/test:v1
Error: failed to copy from source remote repository "docker.io/library/hello-world" (reference: "v1")
Inner error: operation "FetchReference" failed: docker.io/library/hello-world:v1: not found
Error from registry: unauthorized: authentication required: [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository]]

FeynmanZhou avatar Jun 24 '25 06:06 FeynmanZhou

I think printing stuff like [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository] is a pretty terrible user experience. Most of our users are not programmers and that would be confusing.

TerryHowe avatar Jun 27 '25 13:06 TerryHowe

I also think that since the user specified "docker.io/library/hello-world:v1" it is clearer than "docker.io/library/hello-world" (reference: "v1")

We wouldn't really know if the reference or the repository were missing.

TerryHowe avatar Jun 27 '25 13:06 TerryHowe

I think printing stuff like [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository] is a pretty terrible user experience. Most of our users are not programmers and that would be confusing.

The error response [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository]] is directly returned from the registry (in this case, docker.io). We just print it out as it is. Different registries may return different messages.

I might have picked a bad instance, but I don't think we should hide the error messages from the registry.

Wwwsylvia avatar Jul 01 '25 07:07 Wwwsylvia

OK. Let's keep it simple.

How about this error message format:

Error from {source|destination} {registry|OCI layout} for "{reference}": {error}

Examples

Copy Error

$ oras cp docker.io/library/hello-world:v1 localhost:5000/test:v1
Error from source registry for "docker.io/library/hello-world:v1": docker.io/library/hello-world:v1: not found

Pull Error

$ oras pull --oci-layout test_broken:v1
Error from source OCI layout for "test_broken:v1": sha256:f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2: application/vnd.oci.image.layer.v1.tar: not found

Push Error

$ oras push localhost:9999/test:v1
Error from destination registry for "localhost:9999/test:v1": Head "http://localhost:9999/v2/test/manifests/sha256:b4c25e2c630258c6bbddafa1dee0808279d74b28d9cf2bf32898fa762a6b176b": dial tcp 127.0.0.1:9999: connect: connection refused

Attach Error

$ oras attach ghcr.io/oras-project/oras:main --artifact-type "test/type" test
Error from destination registry for "docker.io/library/hello-world:latest": unauthorized: authentication required: [map[Action:pull Class: Name:library/hello-world Type:repository] map[Action:push Class: Name:library/hello-world Type:repository]]

@TerryHowe @FeynmanZhou @shizhMSFT @qweeah What do you think?


Update: For OCI layout, we could use the word oci-layout instead of OCI layout, to be consistent with other output:

Error from source oci-layout for "test_broken:v1": sha256:f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2: application/vnd.oci.image.layer.v1.tar: not found

Wwwsylvia avatar Jul 04 '25 07:07 Wwwsylvia

@Wwwsylvia The error message format looks good to me. Is it necessary to apply {source|destination} to oras pull/push/attach? I think this format only works for oras copy since it has two references.

FeynmanZhou avatar Jul 08 '25 01:07 FeynmanZhou