transfermarkt-datasets

Setup public access to DVC assets

Open dcaribou opened this issue 4 years ago • 1 comments

So that everyone can access the DVC assets and appearances snapshots, allow public access on the bucket.

A few things to consider

  • https://www.reddit.com/r/aws/comments/99ird9/is_it_safe_to_have_an_s3_bucket_with_private
  • ~~Is S3 throttling a possible way to avoid incurring large costs while still allowing public access?~~
  • ~~AWS Budget actions seem to be the way to go for implementing a security shutdown of the bucket access if traffic increases too much here~~
  • This

AWS Budget actions do not support triggering a change in the ACL of an S3 bucket (the way to change the status of a bucket from public to private). Explore the option of setting up a simple Lambda function that subscribes to AWS Budget events and changes the ACL itself. Useful resources:

  • Setting up a lambda with Terraform (link)
  • Setting up a subscription from SNS to a lambda with Terraform (link)
  • Lambda deployment packages (link)
  • Lambda development tools (link)
  • Terraform example (link)
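A minimal sketch of what such a Lambda handler could look like in Python, assuming the budget alert is delivered through an SNS topic and that ACLs are enabled on the bucket. The bucket name, the `BUCKET_NAME` environment variable and the `s3_client` parameter are hypothetical names introduced for illustration; nothing here is taken from the project's actual setup.

```python
import os

# Hypothetical: the bucket to lock down, read from the Lambda environment.
BUCKET_NAME = os.environ.get("BUCKET_NAME", "example-dvc-bucket")


def make_bucket_private(s3_client, bucket):
    """Reset the bucket ACL to the canned 'private' ACL."""
    s3_client.put_bucket_acl(Bucket=bucket, ACL="private")


def handler(event, context, s3_client=None):
    """Lambda entry point: lock the bucket when an SNS notification arrives.

    `s3_client` is injectable so the logic can be exercised without AWS.
    """
    if s3_client is None:
        # Imported lazily so the module loads without boto3 installed.
        import boto3

        s3_client = boto3.client("s3")

    # SNS-triggered invocations carry the message under Records[i]["Sns"].
    sns_records = [r for r in event.get("Records", []) if "Sns" in r]
    if not sns_records:
        # Not an SNS event: leave the bucket untouched.
        return {"locked": False}

    make_bucket_private(s3_client, BUCKET_NAME)
    return {"locked": True, "bucket": BUCKET_NAME}
```

The handler treats any SNS notification as a signal to lock the bucket, so it only makes sense if the topic receives nothing but the budget alert. The Lambda's execution role would need permission to call `s3:PutBucketAcl` on the bucket.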

dcaribou avatar Jan 23 '21 11:01 dcaribou

Though this is still a nice-to-have feature, I do not see it as a priority for the near future, considering that #44 enables users to request access to the backend by simply raising a PR.

A short documentation section for self-granting access to DVC assets can be found in the README.

dcaribou avatar Sep 03 '21 09:09 dcaribou

One option we did not consider enough is to set up a CloudFront distribution to enable HTTP-based access, and use rate-based rules to limit the number of requests, for example per IP.

Resources

  • Troubleshooting access denied errors via CloudFront (link)
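As a rough illustration of the per-IP limit, the sketch below builds a rule definition in the shape AWS WAF (WAFv2) uses for rate-based rules. It assumes the rate limiting would be done with a WAF web ACL attached to the CloudFront distribution, which is not confirmed anywhere in this thread; the helper name, rule name and limit value are hypothetical.

```python
import json


def rate_limit_rule(name, limit, priority=0):
    """Build a WAFv2-style rule that blocks IPs exceeding `limit` requests.

    The request count is aggregated per source IP over WAF's evaluation window.
    """
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,
                "AggregateKeyType": "IP",  # count requests per client IP
            }
        },
        "Action": {"Block": {}},  # block offending IPs while over the limit
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    print(json.dumps(rate_limit_rule("dvc-assets-rate-limit", 1000), indent=2))
```

A dictionary like this could be passed in the `Rules` list when creating a web ACL, or translated into the equivalent Terraform resource to stay consistent with the rest of the infrastructure code.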

dcaribou avatar Sep 22 '23 16:09 dcaribou