delete_blobs method raises an exception PartialBatchErrorException when using Azurite

Open RobertM15 opened this issue 2 years ago • 4 comments

Which service(blob, file, queue, table) does this issue concern?

blob

Which version of the Azurite was used?

3.22.0

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

DockerHub

What's the Node.js version?

NA

What problem was encountered?

Azurite returns an error when calling delete_blobs()

Traceback (most recent call last):
  File "/home/myLocalPath/.config/JetBrains/PyCharm2022.3/scratches/azurite.py", line 30, in <module>
    container_client.delete_blobs(*blob_list)
  File "/home/myLocalPath/.virtualenvs/py310/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/myLocalPath/.virtualenvs/py310/lib/python3.10/site-packages/azure/storage/blob/_container_client.py", line 1360, in delete_blobs
    return self._batch_send(*reqs, **options)
  File "/home/myLocalPath/.virtualenvs/py310/lib/python3.10/site-packages/azure/storage/blob/_shared/base_client.py", line 319, in _batch_send
    process_storage_error(error)
  File "/home/myLocalPath/.virtualenvs/py310/lib/python3.10/site-packages/azure/storage/blob/_shared/response_handlers.py", line 181, in process_storage_error
    exec("raise error from None")   # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
  File "/home/myLocalPath/.virtualenvs/py310/lib/python3.10/site-packages/azure/storage/blob/_shared/base_client.py", line 315, in _batch_send
    raise error
azure.storage.blob._shared.response_handlers.PartialBatchErrorException: There is a partial failure in the batch operation.
ErrorCode:None
Content: --batch_7240c4fb-b11c-11ed-ac16-d39098213827
Content-Type: application/http

HTTP/1.1 400 One of the request inputs is not valid.
x-ms-error-code: InvalidInput
x-ms-request-id: 89d6790c-18d2-4a2c-88b3-80c73e5614ab
content-type: application/xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
  <Code>InvalidInput</Code>
  <Message>One of the request inputs is not valid.
RequestId:89d6790c-18d2-4a2c-88b3-80c73e5614ab
Time:2023-02-20T12:45:25.460Z</Message>
</Error>
--batch_7240c4fb-b11c-11ed-ac16-d39098213827--

Logs

Debug logs: debug.log

Azurite Blob service is starting on 0.0.0.0:10000
Azurite Blob service successfully listens on http://0.0.0.0:10000
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container?restype=container HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "GET /devstoreaccount1/container?restype=container&comp=list HTTP/1.1" 200 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob0 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob1 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob2 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob3 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob4 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob5 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob6 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob7 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob8 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "PUT /devstoreaccount1/container/my_blob9 HTTP/1.1" 201 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "GET /devstoreaccount1/container?restype=container&comp=list HTTP/1.1" 200 -
172.17.0.1 - - [20/Feb/2023:12:45:25 +0000] "POST /devstoreaccount1/container?restype=container&comp=batch HTTP/1.1" 202 -

Steps to reproduce the issue?

I used the script below to test against both Azurite and my real storage account. Azurite returns a 400 response, but against the storage account the script works fine and deletes the blobs.

from azure.core.exceptions import ResourceExistsError

from azure.storage.blob import BlobServiceClient

blob_service_client = BlobServiceClient.from_connection_string((
    "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;"
    "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/"
    "K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;"
))

container_client = blob_service_client.get_container_client("container")

try:
    container_client.create_container()
except ResourceExistsError:
    pass

blob_list = [b.name for b in list(container_client.list_blobs())]
# empty list
print(blob_list)

upload_data = b"Hello World"
for index in range(10):
    container_client.upload_blob(name=f"my_blob{index}", data=upload_data, overwrite=True)

# uploaded blobs
blob_list = [b.name for b in list(container_client.list_blobs())]
print(blob_list)

container_client.delete_blobs(*blob_list)

# empty again
blob_list = [b.name for b in list(container_client.list_blobs())]
print(blob_list)

Have you found a mitigation/solution?

As a workaround, I patch delete_blobs with a side effect that iterates over the blobs and deletes them one at a time. The patch is applied globally in the TestRunner:

    def delete_azure_blobs(self, *blobs, **kwargs):
        for blob in blobs:
            with suppress(ResourceNotFoundError):
                self.delete_blob(blob)

    @patch(
        "azure.storage.blob._container_client.ContainerClient.delete_blobs",
        side_effect=delete_azure_blobs,
        autospec=True,
    )
    def run_tests(self, test_labels, mock, **kwargs):
        return super().run_tests(test_labels, **kwargs)
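
A standalone variant of the same idea, as a minimal sketch: it accepts any object exposing `delete_blob(name)` (as `ContainerClient` does) and deletes blobs one at a time, so no batch request is ever sent. The helper name `delete_blobs_individually` and the `not_found_error` parameter are hypothetical; with the real SDK one would presumably pass `azure.core.exceptions.ResourceNotFoundError`.

```python
from contextlib import suppress


def delete_blobs_individually(container_client, blob_names, not_found_error=LookupError):
    """Delete blobs one by one instead of via a single batch request.

    container_client: any object with a delete_blob(name) method.
    not_found_error: exception type to ignore when a blob is already gone
        (assumption: pass azure.core.exceptions.ResourceNotFoundError
        when using the real SDK).
    Returns the names that were actually deleted.
    """
    deleted = []
    for name in blob_names:
        with suppress(not_found_error):
            container_client.delete_blob(name)
            # Only reached when delete_blob did not raise.
            deleted.append(name)
    return deleted
```

This trades one batch round trip for one request per blob, which is usually acceptable against a local emulator in tests.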

RobertM15 avatar Feb 21 '23 14:02 RobertM15

@EmmaZhu Would you please help to follow up this issue?

blueww avatar Feb 22 '23 02:02 blueww

Hi @RobertM15 ,

The request format sent by the Python SDK differs from that of the .NET and JS SDKs. A batch request body from the Python SDK looks like:

--batch_2c203f2b-b344-11ed-8810-64006a946869
Content-Type: application/http
Content-ID: 0
Content-Transfer-Encoding: binary

DELETE /container/my_blob0? HTTP/1.1
x-ms-date: Thu, 23 Feb 2023 06:34:49 GMT
x-ms-client-request-id: 2c234da4-b344-11ed-a3ef-64006a946869
Authorization: SharedKey devstoreaccount1:HxSV3AEd/dKNinfmjK5UauP2oXN/g6Y9GeToM1Vg+p8=
Content-Length: 0


--batch_2c203f2b-b344-11ed-8810-64006a946869
Content-Type: application/http
Content-ID: 1
Content-Transfer-Encodi…ZwEwOyOgAz8kY1EByPeoOFhoKeCm4lku/CZa654fU=
Content-Length: 0


--batch_2c203f2b-b344-11ed-8810-64006a946869
Content-Type: application/http
Content-ID: 9
Content-Transfer-Encoding: binary

DELETE /container/my_blob9? HTTP/1.1
x-ms-date: Thu, 23 Feb 2023 06:34:49 GMT
x-ms-client-request-id: 2c263268-b344-11ed-a867-64006a946869
Authorization: SharedKey devstoreaccount1:vNxUXqG6ZpPQ2XRUC5KWT8kXevSj9tSdhRvWkyRRcjk=
Content-Length: 0


--batch_2c203f2b-b344-11ed-8810-64006a946869--

A batch request body from the .NET SDK looks like:

--batch_1470d1ae-61c9-42a9-9236-9b5067adfadb
Content-Type: application/http
Content-Transfer-Encoding: binary
Content-ID: 0

DELETE devstoreaccount1/container/blob_1_0 HTTP/1.1
x-ms-date: Thu, 23 Feb 2023 06:37:06 GMT
Accept: application/xml
x-ms-client-request-id: c2b70197-5954-4fcb-bd83-981cc8dc6ce5
x-ms-return-client-request-id: true
Authorization: SharedKey devstoreaccount1:0VCMtZeJf5hpe461tYUqrOgspXHDdnxtR2tBi2gMsWY=
Content-Length: 0

--batch_1470d1ae-61c9-42a9-9236-9b5067adf…470d1ae-61c9-42a9-9236-9b5067adfadb
Content-Type: application/http
Content-Transfer-Encoding: binary
Content-ID: 19

DELETE devstoreaccount1/container/my_blob9 HTTP/1.1
x-ms-date: Thu, 23 Feb 2023 06:37:06 GMT
Accept: application/xml
x-ms-client-request-id: 16d633d2-37dd-4ae4-b12b-44983e929598
x-ms-return-client-request-id: true
Authorization: SharedKey devstoreaccount1:jXokRNtvy7bwqkQAT4JTI9UlLSNj7Q1peM/EfjwDKgM=
Content-Length: 0

--batch_1470d1ae-61c9-42a9-9236-9b5067adfadb--

The path of each subrequest differs. The Python SDK's subrequest line is DELETE /container/my_blob0? HTTP/1.1, while the .NET SDK's is DELETE devstoreaccount1/container/my_blob9 HTTP/1.1.

From the REST API document: https://learn.microsoft.com/en-us/rest/api/storageservices/blob-batch#request-body

It seems each subrequest should carry the full path rather than only the container and blob name.

Azurite cannot make every SDK work in this situation, because it has no way to tell whether a subrequest path includes the account name. For now, we'll stay consistent with the .NET SDK.
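
The ambiguity can be illustrated with a small sketch. This is not Azurite's actual parsing code; the two helper functions are hypothetical and only show that the same path string is syntactically valid under both readings, since blob names may themselves contain slashes.

```python
def _strip(path):
    # Drop the leading slash and any query string (the Python SDK sends a bare "?").
    return path.lstrip("/").split("?", 1)[0]


def split_with_account(path):
    """Read a subrequest path as <account>/<container>/<blob>."""
    account, _, rest = _strip(path).partition("/")
    container, _, blob = rest.partition("/")
    return account, container, blob


def split_without_account(path):
    """Read a subrequest path as <container>/<blob>."""
    container, _, blob = _strip(path).partition("/")
    return container, blob


# .NET-style path, read with the account prefix: parses as intended.
print(split_with_account("devstoreaccount1/container/my_blob9"))
# Python-style path, read with the account prefix: the blob name comes out empty.
print(split_with_account("/container/my_blob0?"))
# .NET-style path, read without the prefix: still a plausible container/blob pair.
print(split_without_account("devstoreaccount1/container/my_blob9"))
```

Under the account-first reading, the Python SDK's path yields a container called `my_blob0` and no blob name, which is consistent with the request being rejected as invalid input.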

Thanks,
Emma

EmmaZhu avatar Feb 23 '23 06:02 EmmaZhu

@EmmaZhu Do you mean that this is actually a bug in the Python SDK? Is this bug tracked somewhere?

bluenote10 avatar Jan 11 '24 09:01 bluenote10

@EmmaZhu Do you mean that this is actually a bug in the Python SDK? Is this bug tracked somewhere?

The line with the bug can be found here: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client_helpers.py#L160

f"/{quote(container_name)}/{quote(str(blob_name), safe='/~')}{query_str}",

The URL path should be /<account_name>/<container_name>/<blob_name>. Changing the above line to the following fixes the issue:

f"/{quote(self.account_name)}/{quote(container_name)}/{quote(str(blob_name), safe='/~')}{query_str}",
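
To make the difference between the two f-strings above concrete, here is a minimal standalone sketch using only `urllib.parse.quote`. The function `subrequest_path` and its parameters are hypothetical and are not part of the SDK; it only mirrors the quoting shown in the two lines.

```python
from urllib.parse import quote


def subrequest_path(container_name, blob_name, account_name=None, query_str=""):
    """Build a batch subrequest path, optionally prefixed with the account name."""
    parts = []
    if account_name is not None:
        # Proposed behaviour: include the account name as the first segment.
        parts.append(quote(account_name))
    parts.append(quote(container_name))
    # Keep "/" and "~" unescaped in blob names, as in the SDK line quoted above.
    parts.append(quote(str(blob_name), safe="/~"))
    return "/" + "/".join(parts) + query_str


# Current behaviour: matches the path seen in the Python SDK batch body.
print(subrequest_path("container", "my_blob0", query_str="?"))
# With the account name included, as the proposed change would produce.
print(subrequest_path("container", "my_blob0", account_name="devstoreaccount1", query_str="?"))
```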

Autom3 avatar Jul 25 '24 10:07 Autom3