clearml
StorageManager cannot work with MinIO
Hi,
Describe the bug
I want to store all tasks, models, datasets, etc. in MinIO. According to the documentation, StorageManager handles uploading and downloading data to a storage platform (I use MinIO). This is my config:
aws {
    s3 {
        # use_credentials_chain: false
        region: "my_xx"
        key: "my_xx"
        secret: "my_xx"
        credentials: [
            {
                # This will apply to all buckets in this host (unless key/value is specifically provided for a given bucket)
                host: "minio_ip:9000"
                key: "my_xx"
                secret: "my_xx"
                region: "my_xx"
                multipart: false
                secure: false
            }
        ]
    }
}
To reproduce
I ran this code to try an upload, but it does not work.
from clearml import StorageManager

manager = StorageManager()
manager.upload_folder(
    local_folder="/home/****/project1/dataset",
    remote_url="s3://ip_minio:9000/test-bucket/dateset-clearml"
)
It returns this error:
...
Failed uploading: Could not connect to the endpoint URL: "http://s3.region-name.amazonaws.com/test-bucket/dateset-clearml/lq/seg_151a8cb2b37f4de6954aaf9fd6e0ee48_1.jpg"
Failed uploading: Could not connect to the endpoint URL: "http://s3.region-name.amazonaws.com/test-bucket/dateset-clearml/lq/seg_9f76595542e342c4a61f83a4b11c9e11_2.jpg"
...
The URL still seems to point to AWS, not minio_ip. Is there something I missed in the MinIO config?
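Since the `credentials` entry in the config is matched per host, one quick sanity check is to split the remote URL and compare its host with the configured one. The following is only a rough sketch using the standard library: `split_minio_url` is a hypothetical helper, not part of ClearML, and it assumes the `s3://host:port/bucket/key` layout used in the report.

```python
from urllib.parse import urlsplit


def split_minio_url(remote_url, secure=False):
    """Split s3://host:port/bucket/key into (endpoint_url, host, bucket, key).

    Hypothetical helper for illustration only; assumes the first path
    segment after the host is the bucket name.
    """
    parts = urlsplit(remote_url)
    if parts.scheme != "s3":
        raise ValueError("expected an s3:// URL, got: %r" % remote_url)
    host = parts.netloc  # e.g. "minio_ip:9000"
    bucket, _, key = parts.path.lstrip("/").partition("/")
    scheme = "https" if secure else "http"  # mirrors `secure: false` in the config
    return "%s://%s" % (scheme, host), host, bucket, key


if __name__ == "__main__":
    configured_host = "minio_ip:9000"  # the `host` value from clearml.conf
    endpoint, host, bucket, key = split_minio_url(
        "s3://minio_ip:9000/test-bucket/dateset-clearml"
    )
    print(endpoint, bucket, key)
    print("host matches config:", host == configured_host)
```

If the host in the URL differs from the `host` in the config, even by a typo, the per-host credentials would presumably not be picked up.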
Environment
- Server : self hosted
- ClearML SDK Version 1.6.4
- ClearML Server Version: WebApp: 1.6.0-213 • Server: 1.6.0-213 • API: 2.20
- Python Version 3.10.4
- OS Linux
Thanks.
I tested uploading with boto3 directly, and it works:
import boto3  # version: 1.24.64
from botocore.client import Config

s3 = boto3.resource('s3',
                    endpoint_url='http://minio_ip:9000',
                    aws_access_key_id='ss',
                    aws_secret_access_key='ss',
                    config=Config(signature_version='s3v4'),
                    region_name='region_name')
s3.Bucket('test-bucket').upload_file('dataset/lq/seg_9f76595542e342c4a61f83a4b11c9e11_2.jpg', 'img2.jpg')
for obj in s3.Bucket('test-bucket').objects.all():
    print(obj.key, obj.last_modified)
After debugging, I found that when the resource for the Boto3 driver is created, it does not contain region_name: https://github.com/allegroai/clearml/blob/a6104347f29d2fa27ddea5c6ff73d6138c558f8b/clearml/storage/helper.py#L1404-L1430
After I hardcoded region_name into that code, StorageManager was able to upload the files to MinIO:
with self._creation_lock:
    boto_kwargs = {
        "endpoint_url": endpoint,
        "region_name": cfg.region,
        "use_ssl": cfg.secure,
        "verify": cfg.verify,
        "config": botocore.client.Config(
            max_pool_connections=max(
                int(_Boto3Driver._min_pool_connections),
                int(_Boto3Driver._pool_connections)),
            connect_timeout=int(_Boto3Driver._connect_timeout),
            read_timeout=int(_Boto3Driver._read_timeout),
        )
    }
    if not cfg.use_credentials_chain:
        boto_kwargs["aws_access_key_id"] = cfg.key
        boto_kwargs["aws_secret_access_key"] = cfg.secret
        if cfg.token:
            boto_kwargs["aws_session_token"] = cfg.token
    self.resource = boto3.resource(
        "s3",
        **boto_kwargs
    )
    self.config = cfg
    bucket_name = self.name[len(cfg.host) + 1:] if cfg.host else self.name
    self.bucket = self.resource.Bucket(bucket_name)
It works for my case, but I don't know whether it will affect other setups.
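One way to limit the impact on other setups might be to pass `region_name` only when a region is actually configured, so configurations without a region keep the previous behavior. This is a minimal sketch of that idea, not the fix that was eventually released: `build_boto_kwargs` is a hypothetical helper, `cfg` is mocked with `SimpleNamespace`, and the `botocore.client.Config` entry is left out so the example runs without boto3 installed.

```python
from types import SimpleNamespace


def build_boto_kwargs(endpoint, cfg):
    """Build keyword arguments for boto3.resource("s3", ...).

    Hypothetical helper: adds region_name only when cfg.region is set.
    """
    kwargs = {
        "endpoint_url": endpoint,
        "use_ssl": cfg.secure,
        "verify": cfg.verify,
    }
    region = getattr(cfg, "region", None)
    if region:  # skip None and empty strings so the default still applies
        kwargs["region_name"] = region
    if not cfg.use_credentials_chain:
        kwargs["aws_access_key_id"] = cfg.key
        kwargs["aws_secret_access_key"] = cfg.secret
        if cfg.token:
            kwargs["aws_session_token"] = cfg.token
    return kwargs


if __name__ == "__main__":
    # Mocked config object; field names follow the snippet above.
    cfg = SimpleNamespace(
        region="my_xx", secure=False, verify=True,
        use_credentials_chain=False, key="my_xx", secret="my_xx", token=None,
    )
    print(build_boto_kwargs("http://minio_ip:9000", cfg))
```

With a region configured, the resulting dict contains `region_name`; with `region` empty or missing, the key is omitted and boto3 resolves the region as it did before.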
Hi @muhammadAgfian96, thanks for this report and detailed info! We'll test this and release a fix as soon as possible 🙂
Great! Thanks
Hi, Any updates on this? Thanks.
Hi @muhammadAgfian96,
Sorry for the late reply. This is planned for release in the next version, and we will update here once it's out.
Hi @muhammadAgfian96,
Can you try with the latest released RC and let us know if the issue is resolved?
Hi @erezalg, I have tested upload_file, upload_folder, and download_folder. All work. Thanks!
https://clearml.slack.com/archives/CTK20V944/p1670677722318369
@muhammadAgfian96 I just checked the Slack thread, and it seems 1.8.4rc0 solved the issue. Is this correct? If so, can this issue be closed?
Hi @erezalg, yes