bug: `invalid uri character` returned when listing COS objects whose keys contain special characters

Open thexiay opened this issue 6 months ago • 3 comments

Describe the bug

When listing a path that contains Chinese characters, the operator's list operation returns `invalid uri character`.

Steps to Reproduce

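use opendal::{services::Cos, Buffer, Operator};

// Construct the Operator `blob`. The values `root`, `bucket`, `endpoint`,
// `access_key`, and `secret_key` are defined elsewhere (omitted in this report).
let builder = Cos::default()
    .root(root)
    .bucket(bucket)
    .endpoint(endpoint)
    .secret_id(access_key)
    .secret_key(secret_key);
let blob = Operator::new(builder)?.finish();

const SPECIAL_KEY: &str = "测试编码文件aaaaaa.jpg";

// Write an object whose key contains Chinese characters, then list by prefix.
let br = Buffer::from("1");
blob.write(SPECIAL_KEY, br).await?;
blob.list("测试编码文件").await?; // returns `invalid uri character`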

Expected Behavior

The list call should return normally, without an error.

Additional Context

No response

Are you willing to submit a PR to fix this bug?

  • [ ] Yes, I would like to submit a PR.

thexiay, Jun 04 '25 05:06

Hi @thexiay! I'm Dosu and I’m helping the opendal team.

The `invalid uri character` error you're seeing when listing COS objects with Chinese characters is likely related to how special characters in URIs are encoded. OpenDAL percent-encodes all non-ASCII characters in paths, following the encodeURIComponent convention, which should handle Chinese characters correctly by encoding them as percent sequences (e.g., %E6%B5%8B%E8%AF%95 for "测试").

However, if the backend (in this case, COS) or any intermediate component expects a different encoding or does not properly decode percent-encoded paths, the result can be an error such as `invalid uri character`. The list operation in OpenDAL uses this percent encoding when constructing the URL for listing objects.
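To make the percent encoding described above concrete, here is a minimal standalone sketch of encodeURIComponent-style encoding. The function name `percent_encode_component` is hypothetical and this is not OpenDAL's own helper, which may differ in the exact set of characters it leaves unescaped.

```rust
/// Percent-encode `input` in the style of JavaScript's `encodeURIComponent`:
/// every UTF-8 byte becomes `%XX` unless it is an ASCII letter, a digit,
/// or one of `- _ . ! ~ * ' ( )`.
/// Hypothetical helper for illustration only, not OpenDAL's implementation.
fn percent_encode_component(input: &str) -> String {
    let mut out = String::with_capacity(input.len() * 3);
    for byte in input.bytes() {
        let keep = byte.is_ascii_alphanumeric()
            || matches!(
                byte,
                b'-' | b'_' | b'.' | b'!' | b'~' | b'*' | b'\'' | b'(' | b')'
            );
        if keep {
            out.push(byte as char);
        } else {
            // Escape the raw byte as two uppercase hex digits.
            out.push_str(&format!("%{:02X}", byte));
        }
    }
    out
}

fn main() {
    // The key from the reproduction steps in this issue.
    let key = "测试编码文件aaaaaa.jpg";
    let encoded = percent_encode_component(key);
    // The encoded form is pure ASCII, so it can be placed in a URI.
    assert!(encoded.is_ascii());
    assert!(encoded.starts_with("%E6%B5%8B%E8%AF%95"));
    println!("{encoded}");
}
```

If a request path or query parameter reaches the HTTP layer without passing through a step like this, the raw multi-byte characters are still present in the URI string.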

There are a few things you can check or try as a workaround:

  • Ensure that the COS endpoint you are using supports percent-encoded UTF-8 paths and is not expecting raw (unencoded) Unicode characters.
  • Double-check that the bucket and endpoint configuration in your Operator matches the expected region and encoding requirements for your COS instance.
  • If you have access to request logs or error messages from COS, see if they provide more detail about which character or sequence is considered invalid.

There are no recent changes in OpenDAL directly related to URI encoding that would obviously introduce this bug, and similar encoding logic is used for other backends without this specific error being reported.

If you can capture the exact URL being sent (for example, by enabling debug logging), it may help pinpoint whether the encoding is incorrect or whether the issue is on the COS side. Improving error messages to include the problematic URL has also been discussed in the project.
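Once a URL has been captured, a quick check like the following sketch can show where a suspect character sits. It is a simplified stand-in, assuming that the error comes from raw non-ASCII, control, or space characters in the URI. It is not the validation performed by any HTTP library, and both `first_suspect_uri_char` and the example URLs are hypothetical.

```rust
/// Return the byte offset and value of the first character in `url` that is
/// non-ASCII, an ASCII control character, or a space.
/// Simplified illustration; real URI parsers apply a stricter table.
fn first_suspect_uri_char(url: &str) -> Option<(usize, char)> {
    url.char_indices()
        .find(|&(_, c)| !c.is_ascii() || c.is_ascii_control() || c == ' ')
}

fn main() {
    // Hypothetical captured URLs: one with a raw prefix, one percent-encoded.
    let raw = "https://example.com/?prefix=测试编码文件";
    let encoded = "https://example.com/?prefix=%E6%B5%8B%E8%AF%95";

    match first_suspect_uri_char(raw) {
        Some((offset, c)) => println!("suspect character {c:?} at byte {offset}"),
        None => println!("no suspect characters found"),
    }
    // A fully percent-encoded URL has nothing to report.
    assert!(first_suspect_uri_char(encoded).is_none());
}
```

If the captured list URL trips this check while the write URL does not, that would suggest a parameter that skipped encoding on the client side rather than a problem in COS.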

If you have further details about the COS configuration or logs, sharing them could help narrow down the root cause.

dosubot[bot], Jun 04 '25 05:06

Thank you @thexiay for bringing this up. I believe it's similar to the issue we're seeing in https://github.com/apache/opendal/issues/5391. We can fix it the same way as in https://github.com/Xuanwo/reqsign/pull/515.

Xuanwo, Jun 04 '25 05:06

It can also be very helpful to add such a test to our behavioral test suites so that we can catch it more easily.

Xuanwo, Jun 04 '25 05:06