earthkit-data
Implement Amazon S3 bucket source
This is still a work in progress.
The new source for an S3 bucket can be used like this:
import earthkit.data
# endpoint="s3.amazonaws.com"
bucket_name = "ecmwf-forecasts"
key = "20240111/00z/0p4-beta/oper/20240111000000-0h-oper-fc.grib2"
r = {
    "bucket": bucket_name,
    "objects": [
        {"object": key}
    ],
}
ds = earthkit.data.from_source("s3", r, stream=False, anon=True)
ds.ls()
More examples are available at: https://earthkit-data.readthedocs.io/en/feature-s3/examples/s3.html
- Multiple buckets and objects can be used
- A single part can be specified for an object as:
  "objects": [
      {"object": key, "start": 0, "range": 438714}
  ],
- The default endpoint is s3.amazonaws.com. Other endpoints can be specified in the request as:
  r = {"bucket": bucket_name,
       "endpoint": "my_endpoint",
       "objects": [
       ....
- The stream option controls whether the data is read as a stream or downloaded to a file. The default is stream=True.
- The anon option controls whether access is anonymous or uses AWS credentials. The default is anon=True. Handling credentials requires the aws-requests-auth and botocore packages.
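
The request options listed above can be combined in one place. The following is a minimal sketch, not part of this PR: build_s3_request and open_s3 are hypothetical helper names, and the parts argument is an illustrative convenience. Only the request keys (bucket, endpoint, objects, object, start, range) and the from_source("s3", ...) call with stream and anon come from the description above. The start and range values are passed through unchanged, without assuming their exact semantics.

```python
from __future__ import annotations

from typing import Any, Optional


def build_s3_request(
    bucket: str,
    keys: list[str],
    endpoint: Optional[str] = None,
    parts: Optional[dict[str, tuple[int, int]]] = None,
) -> dict[str, Any]:
    """Build a request dict from the keys named in the PR description.

    `parts` (hypothetical argument) maps an object key to a
    (start, range) pair that is copied into that object's entry.
    """
    parts = parts or {}
    objects = []
    for key in keys:
        obj: dict[str, Any] = {"object": key}
        if key in parts:
            start, length = parts[key]
            obj["start"] = start
            obj["range"] = length
        objects.append(obj)

    request: dict[str, Any] = {"bucket": bucket, "objects": objects}
    if endpoint is not None:
        # Leaving "endpoint" out keeps the default, s3.amazonaws.com.
        request["endpoint"] = endpoint
    return request


def open_s3(request: dict[str, Any], stream: bool = True, anon: bool = True):
    """Open the request with the s3 source, mirroring the documented defaults."""
    # Imported lazily so build_s3_request works without earthkit-data installed.
    import earthkit.data

    return earthkit.data.from_source("s3", request, stream=stream, anon=anon)
```

For example, open_s3(build_s3_request("ecmwf-forecasts", [key]), stream=False) would correspond to the call shown at the top of this description, with anonymous access left at its default.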