Fails to filter blobs by tags - HTTP 500 with multiple conditions on single tag

Open aaronenberg-msft opened this issue 1 year ago • 9 comments

I am testing Azurite 3.33.0's support for finding blobs by index tags, and it fails with an HTTP 500 for this WHERE clause:

@container='mycontainer' AND MyTag >= 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/' AND MyTag < 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/zzzzzzzzzzzzzzzzzzzzzzz'

The error message is: Error: can't have multiple conditions for a single tag unless they define a range

I expect this to succeed, since the same filter expression works against Azure Blob Storage.

Here is the full debug log for the request:

2024-12-12T19:23:26.907Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobStorageContextMiddleware: RequestMethod=GET RequestURL=http://127.0.0.1/devstoreaccount1/?comp=blobs&where=%40container%3D%27mycontainer%27%20AND%20MyTag%20%3E%3D%20%27Foo%2F296ab642-162c-4db8-a4ae-9517189e411d%2F%27%20AND%20MyTag%20%3C%20%27Foo%2F296ab642-162c-4db8-a4ae-9517189e411d%2Fzzzzzzzzzzzzzzzzzzzzzzz%27&maxresults=100 RequestHeaders:{"host":"127.0.0.1:10000","x-ms-version":"2021-12-02","accept":"application/xml","x-ms-client-request-id":"d132709d-3f10-4efe-86f5-012ada151d3c","x-ms-return-client-request-id":"true","user-agent":"azsdk-net-Storage.Blobs/12.15.1 (.NET 8.0.11; Microsoft Windows 10.0.22631)","x-ms-date":"Thu, 12 Dec 2024 19:23:26 GMT","authorization":"SharedKey devstoreaccount1:19ZJUVfGgitMomyBXGP8nQJ6+hyxu+SLqhHuhNoTijs=","traceparent":"00-3ea7bda368499cfd04c59fd7c1f64610-1307d5eb337eea57-01"} ClientIP=172.17.0.1 Protocol=http HTTPVersion=1.1
2024-12-12T19:23:26.908Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobStorageContextMiddleware: Account=devstoreaccount1 Container= Blob=
2024-12-12T19:23:26.908Z e639ab74-1c14-41ac-97ca-86d1587c49f1 verbose: DispatchMiddleware: Dispatching request...
2024-12-12T19:23:26.909Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: DispatchMiddleware: Operation=Service_FilterBlobs
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications.
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: PublicAccessAuthenticator:validate() Start validation against public access.
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 debug: PublicAccessAuthenticator:validate() Getting account properties...
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: devstoreaccount1, container: , blob:
2024-12-12T19:23:26.912Z e639ab74-1c14-41ac-97ca-86d1587c49f1 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container
2024-12-12T19:23:26.913Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication.
2024-12-12T19:23:26.914Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"GET\n\n\n\n\n\n\n\n\n\n\n\nx-ms-client-request-id:d132709d-3f10-4efe-86f5-012ada151d3c\nx-ms-date:Thu, 12 Dec 2024 19:23:26 GMT\nx-ms-return-client-request-id:true\nx-ms-version:2021-12-02\n/devstoreaccount1/devstoreaccount1/\ncomp:blobs\nmaxresults:100\nwhere:@container='mycontainer' AND MyTag >= 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/' AND MyTag < 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/zzzzzzzzzzzzzzzzzzzzzzz'"
2024-12-12T19:23:26.915Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey devstoreaccount1:19ZJUVfGgitMomyBXGP8nQJ6+hyxu+SLqhHuhNoTijs=
2024-12-12T19:23:26.915Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched.
2024-12-12T19:23:26.915Z e639ab74-1c14-41ac-97ca-86d1587c49f1 verbose: DeserializerMiddleware: Start deserializing...
2024-12-12T19:23:26.916Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: HandlerMiddleware: DeserializedParameters={"options":{"where":"@container='mycontainer' AND MyTag >= 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/' AND MyTag < 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/zzzzzzzzzzzzzzzzzzzzzzz'","maxresults":100,"include":[],"requestId":"d132709d-3f10-4efe-86f5-012ada151d3c"},"comp":"blobs","version":"2021-12-02"}
2024-12-12T19:23:26.917Z e639ab74-1c14-41ac-97ca-86d1587c49f1 error: ErrorMiddleware: Received an error, fill error information to HTTP response
2024-12-12T19:23:26.918Z e639ab74-1c14-41ac-97ca-86d1587c49f1 error: ErrorMiddleware: ErrorName=Error ErrorMessage=can't have multiple conditions for a single tag unless they define a range ErrorStack="Error: can't have multiple conditions for a single tag unless they define a range\n    at QueryParser.validateWithPreviousComparison (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:81:27)\n    at QueryParser.visitBinary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:245:26)\n    at QueryParser.visitExpressionGroup (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:208:25)\n    at QueryParser.visitUnary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:189:28)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:170:27)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitOr (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:149:27)\n    at QueryParser.visitExpression (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:139:21)\n    at QueryParser.visitQuery (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:126:27)"
2024-12-12T19:23:26.918Z e639ab74-1c14-41ac-97ca-86d1587c49f1 error: ErrorMiddleware: Set HTTP code: 500
2024-12-12T19:23:26.918Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: EndMiddleware: End response. TotalTimeInMS=11 StatusCode=500 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.33.0"}
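
The `where` parameter in the logged RequestURL is percent-encoded, which makes it hard to compare with the clause quoted at the top. The following is a minimal Python sketch that decodes it using only the standard library; `decode_where` is a hypothetical helper name, and the URL is copied from the first log line above.

```python
from urllib.parse import parse_qs, urlsplit

# RequestURL copied from the BlobStorageContextMiddleware log line above.
request_url = (
    "http://127.0.0.1/devstoreaccount1/?comp=blobs"
    "&where=%40container%3D%27mycontainer%27%20AND%20MyTag%20%3E%3D%20"
    "%27Foo%2F296ab642-162c-4db8-a4ae-9517189e411d%2F%27%20AND%20MyTag%20%3C%20"
    "%27Foo%2F296ab642-162c-4db8-a4ae-9517189e411d%2Fzzzzzzzzzzzzzzzzzzzzzzz%27"
    "&maxresults=100"
)


def decode_where(url: str) -> str:
    """Return the percent-decoded `where` query parameter of a request URL."""
    query = parse_qs(urlsplit(url).query)
    return query["where"][0]


where_clause = decode_where(request_url)
print(where_clause)
```

The decoded string should match the `where` value shown in the DeserializedParameters log line, which confirms the filter reached Azurite unchanged.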

aaronenberg-msft avatar Dec 12 '24 19:12 aaronenberg-msft

@EmmaZhu Would you please help to look at the blob tag issue?

blueww avatar Dec 16 '24 03:12 blueww

I am using Azurite 3.33.0 in my Python project and hit the same problem when querying blobs with both a greater-than and a less-than condition on the same tag, which works against the actual Blob Storage. The query fails with an HttpResponseError (Internal Server Error), and the error message is "can't have multiple conditions for a single tag unless they define a range", even though the conditions in this case do define a range.

The following query works fine for filtering by a single tag:

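from azure.storage.blob import BlobServiceClient

# azure_blob_storage_endpoint, credentials, and container_name are defined elsewhere.
blob_service_client = BlobServiceClient(
    account_url=azure_blob_storage_endpoint, credential=credentials
)

container_client = blob_service_client.get_container_client(
    container=container_name
)

start_year = 2012

# Single condition on the "year" tag: this request succeeds.
query = f"\"year\">='{start_year}'"
next(container_client.find_blobs_by_tags(filter_expression=query))["name"]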

However, when I attempt to combine multiple conditions using AND, like this:

start_year = 2012
end_year = 2022

query = (
    f"\"year\">='{start_year}' AND \"year\"<='{end_year}'"
)
next(container_client.find_blobs_by_tags(filter_expression=query))["name"]

I get the following error: azure.core.exceptions.HttpResponseError: Internal Server Error ErrorCode: None

The debug log shows:

2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 info: HandlerMiddleware: DeserializedParameters={"options":{"where":"\"year\">='2012' AND \"year\"<='2022'","include":[],"requestId":"65c52b18-d417-11ef-84c3-c84bd64ac0da"},"restype":"container","comp":"blobs","version":"2025-01-05"}
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: Received an error, fill error information to HTTP response
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: ErrorName=Error ErrorMessage=can't have multiple conditions for a single tag unless they define a range ErrorStack="Error: can't have multiple conditions for a single tag unless they define a range\n    at QueryParser.validateWithPreviousComparison (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:81:27)\n    at QueryParser.visitBinary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:249:26)\n    at QueryParser.visitExpressionGroup (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:208:25)\n    at QueryParser.visitUnary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:189:28)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:170:27)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitOr (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:149:27)\n    at QueryParser.visitExpression (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:139:21)\n    at QueryParser.visitQuery (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:126:27)\n    at QueryParser.visit (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:118:21)"
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: Set HTTP code: 500
2025-01-16T14:37:22.861Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 info: EndMiddleware: End response. TotalTimeInMS=5 StatusCode=500 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.33.0"}
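
For reference, the failing filter can be produced by a small helper. This is only a sketch: `build_range_filter` is a hypothetical name and not part of azure-storage-blob, and it assumes tag names and values contain no quote characters. Blob index tag values are strings, so the comparison operators order them lexicographically rather than numerically, which the last two lines illustrate with plain Python strings.

```python
def build_range_filter(tag: str, lower: str, upper: str) -> str:
    """Build an inclusive range filter on a single tag.

    Assumes `tag`, `lower`, and `upper` contain no quote characters.
    """
    lower_condition = f"\"{tag}\">='{lower}'"
    upper_condition = f"\"{tag}\"<='{upper}'"
    return f"{lower_condition} AND {upper_condition}"


query = build_range_filter("year", "2012", "2022")
print(query)

# String comparison is lexicographic: four-digit years sort as expected,
# but values of different lengths do not.
print("2012" <= "2015" <= "2022")  # True
print("9" <= "10")                 # False
```

The resulting expression is the same one that appears in the `where` field of the debug log above.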

tobiasxg avatar Jan 13 '25 16:01 tobiasxg

@tobiasxg

Could you please share the Azurite debug log for both the successful and the failed request? (Run Azurite with a parameter like "-d c:\temp\debug.log".)

@EmmaZhu Would you please help to look at the tag filter issue?

blueww avatar Jan 14 '25 02:01 blueww

@blueww

I can get these docker logs:

172.17.0.1 - - [14/Jan/2025:14:09:59 +0000] "GET /devstoreaccount1/testcontainer?restype=container&comp=blobs&where=%22category_tag%22%3D%27valueA%27 HTTP/1.1" 200 -
172.17.0.1 - - [14/Jan/2025:14:10:54 +0000] "GET /devstoreaccount1/testcontainer?restype=container&comp=blobs&where=%22category_tag%22%3D%27ValueA%27%20AND%20%22custom_tag%22%3D%27valueB%27 HTTP/1.1" 500 -
172.17.0.1 - - [14/Jan/2025:14:11:11 +0000] "GET /devstoreaccount1/testcontainer?restype=container&comp=blobs&where=%22category_tag%22%3D%27ValueA%27%20AND%20%22custom_tag%22%3D%27valueB%27 HTTP/1.1" 500 -

This is not the debug log. To generate the Azurite debug log, you need to run Azurite with the "-d [debugLogPath]" parameter. Since you run Azurite in Docker, start the container with:

  1. "-v c:/azurite:/workspace" to map the host folder c:/azurite to Azurite's workspace location (you can use another host folder).
  2. "-d /workspace/debug.log" to write the debug log to the workspace location; you will then find it at the host path c:/azurite/debug.log.

The following is a sample command line:

docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 -v c:/azurite:/workspace mcr.microsoft.com/azure-storage/azurite azurite --blobHost 0.0.0.0 --queueHost 0.0.0.0 --tableHost 0.0.0.0 -d /workspace/debug.log
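
Once the debug log exists, the error entries can be pulled out with a short script. The following Python sketch assumes the line layout seen in the debug logs earlier in this thread (timestamp, request ID, level, component, message); `iter_error_messages` is a hypothetical helper, and the stack trace in the sample lines is omitted for brevity.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches ErrorMiddleware lines that carry an ErrorMessage field.
ERROR_LINE = re.compile(
    r"^(?P<timestamp>\S+) (?P<request_id>\S+) error: ErrorMiddleware: "
    r"ErrorName=(?P<name>\S+) ErrorMessage=(?P<message>.*?) ErrorStack="
)


def iter_error_messages(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (request_id, error_message) for each matching log line."""
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            yield match["request_id"], match["message"]


# Sample lines in the format shown in this thread (stack trace omitted).
sample_lines = [
    "2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: "
    "ErrorMiddleware: Received an error, fill error information to HTTP response",
    "2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: "
    "ErrorMiddleware: ErrorName=Error ErrorMessage=can't have multiple conditions "
    "for a single tag unless they define a range ErrorStack=\"(stack omitted)\"",
    "2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: "
    "ErrorMiddleware: Set HTTP code: 500",
]

errors = list(iter_error_messages(sample_lines))
for request_id, message in errors:
    print(request_id, message)
```

To run it against a real log, pass an open file object instead of `sample_lines`, for example the file at the host path c:/azurite/debug.log from the command above.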

blueww avatar Jan 15 '25 03:01 blueww

The logs indicate that the failure is caused by using multiple conditions on a single tag. However, the two conditions are meant to find blobs in a specific range, which works on the actual Blob Storage.

2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 info: HandlerMiddleware: DeserializedParameters={"options":{"where":"\"year\">='2012' AND \"year\"<='2022'","include":[],"requestId":"65c52b18-d417-11ef-84c3-c84bd64ac0da"},"restype":"container","comp":"blobs","version":"2025-01-05"}
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: Received an error, fill error information to HTTP response
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: ErrorName=Error ErrorMessage=can't have multiple conditions for a single tag unless they define a range ErrorStack="Error: can't have multiple conditions for a single tag unless they define a range\n    at QueryParser.validateWithPreviousComparison (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:81:27)\n    at QueryParser.visitBinary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:249:26)\n    at QueryParser.visitExpressionGroup (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:208:25)\n    at QueryParser.visitUnary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:189:28)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:170:27)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitOr (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:149:27)\n    at QueryParser.visitExpression (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:139:21)\n    at QueryParser.visitQuery (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:126:27)\n    at QueryParser.visit (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:118:21)"
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: Set HTTP code: 500
2025-01-16T14:37:22.861Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 info: EndMiddleware: End response. TotalTimeInMS=5 StatusCode=500 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.33.0"}
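
To make the expectation concrete, here is one plausible reading of "define a range": two conditions on the same tag where one is a lower bound and the other an upper bound. This Python sketch is an assumption for illustration only and is not Azurite's actual QueryParser logic; `defines_range` is a hypothetical name.

```python
LOWER_BOUND_OPS = {">", ">="}
UPPER_BOUND_OPS = {"<", "<="}


def defines_range(first_op: str, second_op: str) -> bool:
    """Return True if the two operators form one lower and one upper bound."""
    return (
        (first_op in LOWER_BOUND_OPS and second_op in UPPER_BOUND_OPS)
        or (first_op in UPPER_BOUND_OPS and second_op in LOWER_BOUND_OPS)
    )


# The failing query uses >= and <= on the "year" tag.
print(defines_range(">=", "<="))
# Two lower bounds on the same tag would not form a range.
print(defines_range(">", ">="))
```

Under this reading, both queries reported in this issue (`>=` with `<`, and `>=` with `<=`) define a range and would be expected to pass validation.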

tobiasxg avatar Jan 16 '25 14:01 tobiasxg

Hi @aaronenberg-msft, I can reproduce the issue and will fix it later.

EmmaZhu avatar Jan 21 '25 02:01 EmmaZhu

I created a PR that fixes the issue. Could you have a look please?

rgaleev avatar Mar 07 '25 01:03 rgaleev

This is still an issue in 3.34.0. Is there any timeline for the next release?

gorillapower avatar Jul 24 '25 03:07 gorillapower

Hi @gorillapower, we'll have a release next week.

EmmaZhu avatar Jul 24 '25 05:07 EmmaZhu