opendal icon indicating copy to clipboard operation
opendal copied to clipboard

feat(core): Implement list with deleted and versions for gcs

Open hoslo opened this issue 11 months ago • 11 comments

Which issue does this PR close?

Part of #5496

Rationale for this change

Implement the gcs in RFC we need.

What changes are included in this PR?

Implement list with deleted for gcs services Implement versions for gcs services

Are there any user-facing changes?

Users can now list deleted files and user versions from gcs services.

hoslo avatar Jan 15 '25 07:01 hoslo

CodSpeed Performance Report

Merging #5548 will not alter performance

Comparing gcs-list-with-deleted (8b0d9cc) with main (b8a3b7a)

Summary

✅ 73 untouched benchmarks

codspeed-hq[bot] avatar Jan 15 '25 08:01 codspeed-hq[bot]

@Xuanwo Any other questions?

hoslo avatar Jan 22 '25 03:01 hoslo

@Xuanwo Any other questions?

There are two things in my mind now:

The frist, do we really need an extra GcsVersionsLister? It seems nearly the same with GcsLister.

Then, to support list_with_deleted, we may need to use gcs's softDeleted.

However, this API is

If true, only returns soft-deleted objects as part of the objects list response.

And

The softDeleted parameter can only be used successfully if the bucket has a soft delete policy. Otherwise, the request fails with a 400 Bad Request error with the reason invalidArgument. If true, versions cannot be set to true.

For GCS, list_with_deleted is a separate feature that needs to be enabled independently. I believe it would be better to remove this feature for now and reconsider it in the future.

Xuanwo avatar Jan 22 '25 03:01 Xuanwo

There are two things in my mind now:

  1. GcsVersionsLister and GcsLister have different processing strategies.
  2. Gcs's Object Versioning can also control deletion, and is more in line with OpenDAL's semantics (softDeleted has an expiration time).

hoslo avatar Jan 22 '25 06:01 hoslo

  1. GcsVersionsLister and GcsLister have different processing strategies.

Hi, s3 split them because we need two different API calls, requiring distinct parsing logic. However, GCS doesn’t have this issue; we can simply unify them based on different OpList inputs.

2. Gcs's Object Versioning can also control deletion, and is more in line with OpenDAL's semantics (softDeleted has an expiration time).

Oh, you are right. soft_delete is another new feature.

Xuanwo avatar Jan 22 '25 09:01 Xuanwo

we can simply unify them based on different OpList inputs.

If we get versioned output, we also need to group and sort the entities, and the processing flow is different from non-versioned.

hoslo avatar Jan 23 '25 06:01 hoslo

If we get versioned output, we also need to group and sort the entities, and the processing flow is different from non-versioned.

Hey, I don't understand. We should definitely avoid sorting the entries, as it could change the output order. In fact, the versioned output should be the same as the non-versioned output; the only difference is that in the non-versioned output (no versions, no deleted), we only have files where is_current is true.

Xuanwo avatar Jan 23 '25 07:01 Xuanwo

We should definitely avoid sorting the entries

Yes, I understood it wrong, and made some changes, plz review it.

hoslo avatar Jan 23 '25 08:01 hoslo

Hi, seems our tests failed for:

Unsupported (persistent) at Deleter::delete => The service gcs does not support the operation Deleter::delete with the arguments version. Please verify if the relevant flags have been enabled, or submit an issue if you believe this is incorrect.

Xuanwo avatar Apr 07 '25 12:04 Xuanwo

Hi @hoslo, sorry for the late. Would you like to resolve the conflicts? I will try to find sometime to setup the CI.

Xuanwo avatar May 30 '25 02:05 Xuanwo

Hi @hoslo, sorry for the late. Would you like to resolve the conflicts? I will try to find sometime to setup the CI.

OK, I'll resolve this conflict.

hoslo avatar May 30 '25 02:05 hoslo