S3 dataset access
Hi, I understand the dataset can be streamed from S3. Following the example in the docs, I get an error, so I assume access must be granted?
> aws s3 ls s3://clay-tiles-02/02/27WXN/
An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
Hi Rob!! I think the right move here is to copy a representative sample of embeddings to source.coop
I don't think it makes sense to publicly host a copy of the whole training set when it is just a cropped selection of data that is already available. E.g. on v1 we have 50M chips, and we are moving towards streaming from source COGs into the GPUs during training anyway. https://github.com/Clay-foundation/stacchip
In the meantime I've just activated requester pays on this bucket.
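For anyone accessing the bucket from Python instead of the CLI, here is a minimal sketch of a requester-pays listing. It assumes boto3 is installed and AWS credentials are configured; `list_keys` is a hypothetical helper name used for illustration and is not part of Clay. The bucket and prefix are the ones from the command above.

```python
def list_keys(bucket, prefix, client=None):
    """Return the object keys under `prefix` in a requester-pays bucket.

    `client` is an optional S3 client; when omitted, a default boto3
    client is created from the configured AWS credentials (assumption).
    """
    if client is None:
        import boto3  # imported lazily so the helper also works with a stub client

        client = boto3.client("s3")

    # RequestPayer="requester" tells S3 that the caller accepts the request
    # and transfer charges, which a requester-pays bucket requires.
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix,
        RequestPayer="requester",
    )

    # Pages with no matching objects have no "Contents" entry.
    return [obj["Key"] for page in pages for obj in page.get("Contents", [])]


# Example call (needs network access and valid credentials):
# keys = list_keys("clay-tiles-02", "02/27WXN/")
```

Requester-pays requests have to be authenticated, and the bucket policy still has to grant the calling account list and read permissions, so an AccessDenied error can persist even with the requester-pays parameter set.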
@brunosan I get an error:
⚡ ~/Clay-Foundation-Model aws s3 ls s3://clay-tiles-02/02/27WXN/ --request-payer requester
An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
@robmarkcole for Clay v1 we do not recommend using these datacubes anymore. The input can be generated much more flexibly and adapted to the use case, as described in the following tutorial:
https://clay-foundation.github.io/model/tutorials/clay-v1-wall-to-wall.html
Please let us know if we can help you with testing Clay v1, happy to advise on data preparation for your use case if you have questions!