transfermarkt-datasets
Setup public access to DVC assets
So that everyone can access DVC assets and appearances snapshots, allow public access on the bucket.
A few things to consider
- https://www.reddit.com/r/aws/comments/99ird9/is_it_safe_to_have_an_s3_bucket_with_private
- ~~Is S3 throttling a possible solution to avoid incurring big costs while still allowing public access?~~
- ~~AWS Budget actions seem to be the way to go for implementing a security shutdown of the bucket access if traffic increases too much~~
- This
AWS Budget actions do not support triggering a change in the ACL of an S3 bucket (the way to change the status of a bucket from public to private). Explore the option of setting up a simple Lambda that subscribes to AWS Budget events and makes the bucket private when a budget threshold is crossed. Useful resources
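A minimal sketch of what such a Lambda could look like, assuming the budget alert is delivered through an SNS topic that triggers the function. The bucket name, the `DVC_BUCKET` environment variable and the helper names are hypothetical. Instead of editing the ACL, this sketch turns on S3 Block Public Access for the bucket, which overrides public ACLs and policies; that is a substitution on my part, not something decided in this thread.

```python
import os

# Hypothetical bucket name, read from an environment variable set on the Lambda.
BUCKET = os.environ.get("DVC_BUCKET", "my-dvc-bucket")

# Turning on all four settings blocks public access granted via ACLs or policies.
BLOCK_ALL = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def make_private(s3_client, bucket):
    """Enable S3 Block Public Access on the given bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=BLOCK_ALL,
    )


def handler(event, context, s3_client=None):
    """Lambda entry point; assumes it is invoked by the budget's SNS topic."""
    if s3_client is None:
        import boto3  # imported lazily so the module can be tested without AWS

        s3_client = boto3.client("s3")

    # Only act on SNS-delivered events; ignore anything else.
    records = event.get("Records", [])
    if not any("Sns" in record for record in records):
        return {"blocked": False}

    make_private(s3_client, BUCKET)
    return {"blocked": True, "bucket": BUCKET}
```

The function's execution role would need permission to call `s3:PutBucketPublicAccessBlock` on the bucket. Re-opening the bucket afterwards would remain a manual step in this sketch.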
Though this is still a nice-to-have feature, I do not see it as a priority for the near future, considering that #44 enables users to request access to the backend by simply raising a PR.
A short documentation section for self-granting access to DVC assets can be found in the README.
One option we did not consider enough is to set up a CloudFront distribution to enable HTTP-based access, and use rate-based rules to limit the number of requests, for example, per IP.
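As a rough illustration of the rate-based idea, here is a sketch of an AWS WAF (WAFv2) rule definition that blocks an IP once it exceeds a request limit within WAF's evaluation window. The rule name, limit and helper function are hypothetical; the dictionary follows the shape that the WAFv2 `create_web_acl` call takes in its `Rules` list. For CloudFront, the web ACL has to be created with scope `CLOUDFRONT` in `us-east-1` and then associated with the distribution.

```python
def rate_limit_rule(name, limit, priority=0):
    """Build a WAFv2 rate-based rule that blocks IPs exceeding `limit` requests."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,             # max requests per IP in the window
                "AggregateKeyType": "IP",   # count requests per source IP
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Hypothetical values: block any IP that sends more than 1000 requests per window.
rule = rate_limit_rule("dvc-assets-rate-limit", limit=1000)
```

This only builds the rule; creating the web ACL and attaching it to the distribution is left out of the sketch.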
Resources
- Troubleshooting access denied errors via CloudFront