[Feature Request]: Manage GCS soft delete policy in temp location
What would you like to happen?
Background
GCS introduced a soft delete policy feature for new and existing buckets in March 2024. By default, the soft delete retention period is 7 days. (https://cloud.google.com/resources/storage/soft-delete-announce?hl=en)
This feature may impact Beam users who use temp and staging locations in GCS for temporary files and pipeline execution, because deleted objects are still billed for storage until the soft delete retention window expires.
Tasks
- [ ] Disable soft delete policy when creating a new default bucket for a project
- [ ] Warn users about potential costs if the temporary location is in a bucket with the soft delete policy enabled
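A minimal sketch of the two tasks in Python, not the actual SDK change. It assumes a bucket object that exposes `soft_delete_policy.retention_duration_seconds`, as `Bucket` does in recent versions of `google-cloud-storage`. The helper names and the `SimpleNamespace` stand-in bucket are hypothetical and exist only so the example runs without network access.

```python
import logging
from types import SimpleNamespace

_LOGGER = logging.getLogger(__name__)


def soft_delete_retention_seconds(bucket):
    """Return the soft delete retention in seconds, or 0 if no policy is set."""
    policy = getattr(bucket, "soft_delete_policy", None)
    if policy is None:
        return 0
    return getattr(policy, "retention_duration_seconds", None) or 0


def warn_if_soft_delete_enabled(bucket, purpose="temp_location"):
    """Log a cost warning and return True when soft delete is enabled."""
    seconds = soft_delete_retention_seconds(bucket)
    if seconds > 0:
        _LOGGER.warning(
            "Bucket %s used as %s has soft delete enabled (retention: %d s). "
            "Deleted temporary files are still billed during that window.",
            getattr(bucket, "name", "<unknown>"), purpose, seconds)
        return True
    return False


def disable_soft_delete(bucket):
    """Set the retention to 0, e.g. on a default bucket before creating it."""
    bucket.soft_delete_policy.retention_duration_seconds = 0
    return bucket


# Hypothetical stand-in for a bucket with the default 7-day retention.
example_bucket = SimpleNamespace(
    name="example-temp-bucket",
    soft_delete_policy=SimpleNamespace(
        retention_duration_seconds=7 * 24 * 60 * 60))
```

With a real client, `disable_soft_delete` would be applied to the bucket object before the create call, and `warn_if_soft_delete_enabled` would run once when the temp or staging location is validated.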
Issue Priority
Priority: 2
Issue Components
- [X] Component: Python SDK
- [X] Component: Java SDK
- [X] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
.take-issue
What is the status here?
Everything is done except for adding a warning to the Python SDK when the bucket has soft delete enabled.
I am asking because this issue is targeting the 2.57.0 release. Are we able to cherry-pick? Or is it good enough to have some of the functionality on the 2.57.0 branch?
The three merged commits should be good for cherry-picking. I am waiting for the fourth (last) one to be reviewed. If it gets through today or tomorrow, can we cherry-pick it as well?
OK, the remaining task is done and merged. I think we are good to close this issue @kennknowles.
SGTM