duckdb_azure
Unable to query multiple files on Azure using a container-level SAS token
What happens?
When querying multiple files with an az:// glob pattern, the following exception is raised:
duckdb.duckdb.IOException: IO Error: AzureStorageFileSystem Read to az://<account>.blob.core.windows.net/<container>/path/to/blobs/*.json failed with NoAuthenticationInformation Reason Phrase: Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
This exception is not raised when pointing to a specific blob (see the examples under To Reproduce).
The SAS token was created following this guide: https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/how-to-guides/create-sas-tokens?tabs=Containers
with all permissions selected (read, write, list, etc.).
To Reproduce
Method 1
import duckdb
from adlfs.spec import AzureBlobFileSystem
fs = AzureBlobFileSystem(
account_name='',
container_name='', # tried with and without this param
sas_token='mySasToken',
)
print(fs.glob("<container_name>/"))  # works
print(fs.ls("<container_name>/"))  # works
connection = duckdb.connect()
connection.register_filesystem(fs)
data = connection.sql("""
SELECT *
FROM read_json('az://<account_name>.blob.core.windows.net/<container>/path/to/specificFile.json');
""")  # works
data = connection.sql("""
SELECT *
FROM read_json('az://<account_name>.blob.core.windows.net/<container>/path/to/multiple/files/*.json');
""") # raises IOException
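A possible workaround sketch, not verified against this setup: since `fs.glob` and single-blob reads both succeed above, the wildcard could be expanded on the client and the resulting explicit file list passed to `read_json`. The helper names (`to_az_urls`, `build_read_json_query`, `read_json_expanded`) are hypothetical, and the sketch assumes `fs.glob` returns paths of the form `<container>/path/to/blob.json`.

```python
def to_az_urls(account_name, blob_paths):
    """Turn '<container>/path/blob.json' paths into fully qualified az:// URLs."""
    prefix = f"az://{account_name}.blob.core.windows.net/"
    return [prefix + path.lstrip("/") for path in blob_paths]


def build_read_json_query(urls):
    """Build a read_json query over an explicit list of files (no wildcard)."""
    # Double any single quote so each URL is a valid SQL string literal.
    quoted = ", ".join("'" + url.replace("'", "''") + "'" for url in urls)
    return f"SELECT * FROM read_json([{quoted}]);"


def read_json_expanded(connection, fs, account_name, pattern):
    """Expand the glob with the fsspec filesystem, then query the explicit list.

    `connection` is a DuckDB connection and `fs` an fsspec-style filesystem
    (for example the AzureBlobFileSystem from Method 1); both are passed in.
    """
    blob_paths = fs.glob(pattern)  # assumed to return '<container>/...' paths
    if not blob_paths:
        raise FileNotFoundError(f"no blobs matched {pattern!r}")
    return connection.sql(build_read_json_query(to_az_urls(account_name, blob_paths)))
```

With the objects from Method 1 this would be called as `read_json_expanded(connection, fs, "<account_name>", "<container>/path/to/multiple/files/*.json")`. This only sidesteps the wildcard listing step; it does not explain why the listing request is rejected.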
Method 2
import duckdb
duckdb.execute("""
INSTALL azure;
LOAD azure;
""")
duckdb.execute("""
CREATE SECRET secret1 (
TYPE AZURE,
CONNECTION_STRING 'mySasToken'
);
""")
connection = duckdb.connect()
data = connection.sql("""
SELECT *
FROM read_json('az://<account_name>.blob.core.windows.net/<container>/path/to/specificFile.json');
""")  # works
data = connection.sql("""
SELECT *
FROM read_json('az://<account_name>.blob.core.windows.net/<container>/path/to/multiple/files/*.json');
""")  # raises IOException
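One thing that may be worth ruling out in Method 2: the secret receives the bare SAS token as `CONNECTION_STRING`, whereas an Azure Storage connection string that carries a SAS normally has the form `BlobEndpoint=...;SharedAccessSignature=...`. The sketch below builds such a string and the matching `CREATE SECRET` statement. The helper names are hypothetical, and whether this changes the wildcard behaviour is an untested assumption.

```python
def sas_connection_string(account_name, sas_token):
    """Wrap a SAS token in an Azure Storage connection string."""
    # Tokens are sometimes copied with a leading '?'; drop it if present.
    token = sas_token.lstrip("?")
    endpoint = f"https://{account_name}.blob.core.windows.net"
    return f"BlobEndpoint={endpoint};SharedAccessSignature={token}"


def create_secret_sql(secret_name, connection_string):
    """Build the CREATE SECRET statement used in Method 2."""
    # Double any single quote so the value is a valid SQL string literal.
    escaped = connection_string.replace("'", "''")
    return (
        f"CREATE SECRET {secret_name} "
        f"(TYPE AZURE, CONNECTION_STRING '{escaped}');"
    )
```

The generated statement would then be run with `connection.execute(...)` on the same connection that issues the `read_json` query, so that the secret and the query share one DuckDB instance.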
OS:
arm64 (Apple M1)
DuckDB Version:
0.10.2
DuckDB Client:
Python
Full Name:
Erik Farmer
Affiliation:
PepsiCo
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
- [X] Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
- [X] Yes, I have