MinIO: ObjectHashMismatchError when trying to upload object
Summary
The MinIO provider raises an ObjectHashMismatchError when uploading an object, for no apparent reason.
The failure is 100% reproducible, and code to reproduce it is provided.
Detailed Information
libcloud: latest stable, 3.3.1
Python: 3.8.10
OS: Ubuntu 20.04
Here is a repo that reproduces the bug using pytest (there are actually 2 bugs to report):
https://github.com/Wenzel/libcloud_bug
$ pytest -k test_demo_ObjectHashMismatchError_with_pyfakefs
clean_minio_db = <libcloud.storage.drivers.minio.MinIOStorageDriver object at 0x7f1410f3fdf0>, fs = <pyfakefs.fake_filesystem.FakeFilesystem object at 0x7f141021fa90>
    def test_demo_ObjectHashMismatchError_with_pyfakefs(clean_minio_db, fs):
        # create test file
        test_file = "/file1.txt"
        test_file_data = b"hello"
        with open(test_file, "wb") as f:
            f.write(test_file_data)
        # create container
        driver = clean_minio_db
        container = driver.create_container('test')
        # test
>       driver.upload_object(str(test_file), container, 'test_file')
/home/wenzel/Projets/libcloud_bug/tests/test_demo_bug.py:13:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/wenzel/.cache/pypoetry/virtualenvs/libcloud-bug-IT-EyGKG-py3.8/lib/python3.8/site-packages/libcloud/storage/drivers/s3.py:545: in upload_object
return self._put_object(container=container, object_name=object_name,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <libcloud.storage.drivers.minio.MinIOStorageDriver object at 0x7f1410f3fdf0>, container = <Container: name=test, provider=MinIO Storage Driver>, object_name = 'test_file', method = 'PUT'
query_args = None, extra = {}, file_path = '/file1.txt', stream = None, verify_hash = True, storage_class = None, headers = {'connection': 'close', 'content-type': 'text/plain; charset=utf-8'}
    def _put_object(self, container, object_name, method='PUT',
                    query_args=None, extra=None, file_path=None,
                    stream=None, verify_hash=True, storage_class=None,
                    headers=None):
        headers = headers or {}
        extra = extra or {}
        headers.update(self._to_storage_class_headers(storage_class))
        content_type = extra.get('content_type', None)
        meta_data = extra.get('meta_data', None)
        acl = extra.get('acl', None)
        if meta_data:
            for key, value in list(meta_data.items()):
                key = self.http_vendor_prefix + '-meta-%s' % (key)
                headers[key] = value
        if acl:
            headers[self.http_vendor_prefix + '-acl'] = acl
        request_path = self._get_object_path(container, object_name)
        if query_args:
            request_path = '?'.join((request_path, query_args))
        result_dict = self._upload_object(
            object_name=object_name, content_type=content_type,
            request_path=request_path, request_method=method,
            headers=headers, file_path=file_path, stream=stream)
        response = result_dict['response']
        bytes_transferred = result_dict['bytes_transferred']
        headers = response.headers
        response = response
        server_hash = headers.get('etag', '').replace('"', '')
        server_side_encryption = headers.get('x-amz-server-side-encryption',
                                             None)
        aws_kms_encryption = (server_side_encryption == 'aws:kms')
        hash_matches = (result_dict['data_hash'] == server_hash)
        # NOTE: If AWS KMS server side encryption is enabled, ETag won't
        # contain object MD5 digest so we skip the checksum check
        # See https://docs.aws.amazon.com/AmazonS3/latest/API
        # /RESTCommonResponseHeaders.html
        # and https://github.com/apache/libcloud/issues/1401
        # for details
        if verify_hash and not aws_kms_encryption and not hash_matches:
>           raise ObjectHashMismatchError(
                value='MD5 hash {0} checksum does not match {1}'.format(
                    server_hash, result_dict['data_hash']),
E libcloud.storage.types.ObjectHashMismatchError: <ObjectHashMismatchError in <libcloud.storage.drivers.minio.MinIOStorageDriver object at 0x7f1410f3fdf0>, value=MD5 hash checksum does not match 5d41402abc4b2a76b9719d911017c592, object = test_file>
/home/wenzel/.cache/pypoetry/virtualenvs/libcloud-bug-IT-EyGKG-py3.8/lib/python3.8/site-packages/libcloud/storage/drivers/s3.py:922: ObjectHashMismatchError
------------------------------------------------------------------------------------------- Captured stdout setup -------------------------------------------------------------------------------------------
db2dfd078b96e1b0932dc354926aadb77adba2d8c19b8ccc875e7fe5163e8f46
----------------------------------------------------------------------------------------- Captured stdout teardown ------------------------------------------------------------------------------------------
libcloud_bug_objectmistmatch_miniodb
========================================================================================== short test summary info ==========================================================================================
FAILED tests/test_demo_bug.py::test_demo_ObjectHashMismatchError_with_pyfakefs - libcloud.storage.types.ObjectHashMismatchError: <ObjectHashMismatchError in <libcloud.storage.drivers.minio.MinIOStorageD..
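upload_object fails while verifying the hash. Judging by the error message, the hash taken from the server's ETag header is empty while the locally computed MD5 is 5d41402abc4b2a76b9719d911017c592, but I see no clear reason for this.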
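For reference, the comparison that fails can be sketched in isolation. This is only a sketch of the check visible in the traceback: `etag_matches` is a hypothetical helper, not part of libcloud, and the idea that the response carried no usable `etag` header is an assumption inferred from the empty value in the error message, not something confirmed.

```python
import hashlib


def etag_matches(local_data: bytes, response_headers: dict) -> bool:
    """Hypothetical helper mirroring the check in _put_object (not libcloud API)."""
    # Same normalisation as in the traceback: a missing header becomes ''
    # and surrounding double quotes are stripped from the ETag value.
    server_hash = response_headers.get("etag", "").replace('"', "")
    data_hash = hashlib.md5(local_data).hexdigest()
    return data_hash == server_hash


local = b"hello"  # same payload as the test file in the repro

# ETag present and equal to the MD5 of the payload: the check passes.
print(etag_matches(local, {"etag": '"5d41402abc4b2a76b9719d911017c592"'}))

# Assumed failure case: no etag header, so the server hash is '' and the check fails.
print(etag_matches(local, {}))
```

With an empty server hash the comparison can never succeed, whatever the file content is, which would match the 100% reproducibility.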
Thanks for maintaining libcloud!
Thanks for contributing to this issue. As it has been 90 days since the last activity, we are automatically marking it as stale. If this issue is not relevant or applicable anymore (problem has been fixed in a new version or similar), please close the issue or let us know so we can close it. On the contrary, if the issue is still relevant, there is nothing you need to do, but if you have any additional details or context which would help us when working on this issue, please include it as a comment to this issue.