iceberg-rust

feat: support azure blob storage

Open wcy-fdu opened this issue 8 months ago • 3 comments

Which issue does this PR close?

  • Closes #.

What changes are included in this PR?

This PR follows the earlier one that added GCS as a storage backend, and adds support for Azure Blob Storage. Authentication uses account_name, account_key, and an endpoint URL. Connectivity and correctness have already been verified on the RisingWave side, where both reads and writes against Azure Blob work.
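As a rough illustration of the three authentication inputs named above, here is a minimal sketch of collecting them from a string property map. The property keys (`azblob.account-name`, `azblob.account-key`, `azblob.endpoint`) and the `AzblobConfig` type are hypothetical names chosen for this example. They are not taken from this PR or from any released iceberg-rust API.

```rust
use std::collections::HashMap;

// Hypothetical property keys, for illustration only.
const AZBLOB_ACCOUNT_NAME: &str = "azblob.account-name";
const AZBLOB_ACCOUNT_KEY: &str = "azblob.account-key";
const AZBLOB_ENDPOINT: &str = "azblob.endpoint";

/// The three settings the PR description says are used for authentication.
#[derive(Debug, PartialEq)]
pub struct AzblobConfig {
    pub account_name: String,
    pub account_key: String,
    pub endpoint: String,
}

impl AzblobConfig {
    /// Builds the config from a property map, failing on the first
    /// property that is missing or empty.
    pub fn from_props(props: &HashMap<String, String>) -> Result<Self, String> {
        let get = |key: &str| {
            props
                .get(key)
                .filter(|v| !v.is_empty())
                .cloned()
                .ok_or_else(|| format!("missing required property `{}`", key))
        };
        Ok(Self {
            account_name: get(AZBLOB_ACCOUNT_NAME)?,
            account_key: get(AZBLOB_ACCOUNT_KEY)?,
            endpoint: get(AZBLOB_ENDPOINT)?,
        })
    }
}
```

Failing early on a missing property keeps credential errors at configuration time rather than at the first read or write.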

Are these changes tested?

wcy-fdu avatar Apr 24 '25 05:04 wcy-fdu

Azblob and azdls are different storage services, but most Iceberg implementations seem to treat them as the same.

CC @Fokko, what are your thoughts? Would it be better to add native azblob support, or should we just add azdls?

Xuanwo avatar Apr 27 '25 07:04 Xuanwo

I think which blob storage to use in Azure should be a choice for the folks deploying the warehouse and not something that needs to be decided by iceberg sdks -- in other words, why not both? But azdls is definitely recommended for this kind of workload.

corleyma avatar Apr 28 '25 22:04 corleyma

> I think which blob storage to use in Azure should be a choice for the folks deploying the warehouse and not something that needs to be decided by iceberg sdks -- in other words, why not both? But azdls is definitely recommended for this kind of workload.

Hi, I agree with your comments. The question I'm trying to resolve is which service's API specification we should use: azblob or azdls.

Judging from the Java implementation, it seems we should talk to Azure through the azdls API instead:

https://github.com/apache/iceberg/blob/829ae7a11dc1eb62246c801ce1c7e501356c5463/azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java#L39C1-L44C29

 * For compatibility, locations using the wasb scheme are also accepted but will use the Azure Data
 * Lake Storage Gen2 REST APIs instead of the Blob Storage REST APIs.
 *
 * <p>See <a
 * href="https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction-abfs-uri#uri-syntax">Azure
 * Data Lake Storage URI</a>
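To make the scheme handling described in that Javadoc concrete, here is a minimal sketch of a parser that accepts both abfs and wasb style locations and splits them into the same components. The `AzureLocation` type and `parse_azure_location` function are hypothetical names for this example and are not the Java or iceberg-rust implementation. The sketch assumes the `<scheme>://<container>@<account host>/<path>` layout and, as a simplification, requires the container part to be present.

```rust
/// Components of an Azure storage location (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct AzureLocation {
    pub scheme: String,
    /// Container (Blob Storage) or file system (Data Lake Storage Gen2).
    pub container: String,
    /// Account host, e.g. `<account>.dfs.core.windows.net`.
    pub account_host: String,
    /// Object path without the leading slash; may be empty.
    pub path: String,
}

/// Parses `abfs`, `abfss`, `wasb`, and `wasbs` locations into one shape,
/// so a single backend can serve all four schemes.
pub fn parse_azure_location(uri: &str) -> Result<AzureLocation, String> {
    let (scheme, rest) = uri
        .split_once("://")
        .ok_or_else(|| format!("missing scheme in `{}`", uri))?;
    if !matches!(scheme, "abfs" | "abfss" | "wasb" | "wasbs") {
        return Err(format!("unsupported scheme `{}`", scheme));
    }

    // Everything up to the first '/' is the authority; the rest is the path.
    let (authority, path) = match rest.split_once('/') {
        Some((authority, path)) => (authority, path),
        None => (rest, ""),
    };

    // Assumption of this sketch: the authority is `<container>@<account host>`.
    let (container, host) = authority
        .split_once('@')
        .ok_or_else(|| format!("expected `<container>@<account host>` in `{}`", uri))?;
    if container.is_empty() || host.is_empty() {
        return Err(format!("empty container or account host in `{}`", uri));
    }

    Ok(AzureLocation {
        scheme: scheme.to_string(),
        container: container.to_string(),
        account_host: host.to_string(),
        path: path.to_string(),
    })
}
```

With a parser like this, the choice of REST API (Blob Storage or Data Lake Storage Gen2) can be made independently of the scheme in the location string, which is the behavior the quoted Javadoc describes for wasb.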

Xuanwo avatar Apr 29 '25 03:04 Xuanwo

Hi, we are also working with Iceberg and Azure, and we can't really use this because the only schemes our currently supported catalogs handle are wasb and abfs.

christophediprima avatar May 12 '25 09:05 christophediprima

@wcy-fdu what do you think? Is that doable for you? Does this align with the priorities you have at RisingWave?

christophediprima avatar May 16 '25 08:05 christophediprima