feat(core): Implement list with deleted and versions for gcs
Which issue does this PR close?
Part of #5496
Rationale for this change
Implement the gcs in RFC we need.
What changes are included in this PR?
Implement list with deleted for gcs services Implement versions for gcs services
Are there any user-facing changes?
Users can now list deleted files and user versions from gcs services.
CodSpeed Performance Report
Merging #5548 will not alter performance
Comparing gcs-list-with-deleted (8b0d9cc) with main (b8a3b7a)
Summary
✅ 73 untouched benchmarks
@Xuanwo Any other questions?
@Xuanwo Any other questions?
There are two things in my mind now:
The frist, do we really need an extra GcsVersionsLister? It seems nearly the same with GcsLister.
Then, to support list_with_deleted, we may need to use gcs's softDeleted.
However, this API is
If true, only returns soft-deleted objects as part of the objects list response.
And
The softDeleted parameter can only be used successfully if the bucket has a soft delete policy. Otherwise, the request fails with a 400 Bad Request error with the reason invalidArgument. If true, versions cannot be set to true.
For GCS, list_with_deleted is a separate feature that needs to be enabled independently. I believe it would be better to remove this feature for now and reconsider it in the future.
There are two things in my mind now:
GcsVersionsListerandGcsListerhave different processing strategies.- Gcs's Object Versioning can also control deletion, and is more in line with OpenDAL's semantics (
softDeletedhas an expiration time).
GcsVersionsListerandGcsListerhave different processing strategies.
Hi, s3 split them because we need two different API calls, requiring distinct parsing logic. However, GCS doesn’t have this issue; we can simply unify them based on different OpList inputs.
2. Gcs's Object Versioning can also control deletion, and is more in line with OpenDAL's semantics (
softDeletedhas an expiration time).
Oh, you are right. soft_delete is another new feature.
we can simply unify them based on different
OpListinputs.
If we get versioned output, we also need to group and sort the entities, and the processing flow is different from non-versioned.
If we get versioned output, we also need to group and sort the entities, and the processing flow is different from non-versioned.
Hey, I don't understand. We should definitely avoid sorting the entries, as it could change the output order. In fact, the versioned output should be the same as the non-versioned output; the only difference is that in the non-versioned output (no versions, no deleted), we only have files where is_current is true.
We should definitely avoid sorting the entries
Yes, I understood it wrong, and made some changes, plz review it.
Hi, seems our tests failed for:
Unsupported (persistent) at Deleter::delete => The service gcs does not support the operation Deleter::delete with the arguments version. Please verify if the relevant flags have been enabled, or submit an issue if you believe this is incorrect.
Hi @hoslo, sorry for the late. Would you like to resolve the conflicts? I will try to find sometime to setup the CI.
Hi @hoslo, sorry for the late. Would you like to resolve the conflicts? I will try to find sometime to setup the CI.
OK, I'll resolve this conflict.