Can we trim the storage scheme after parsing?
One question I have is whether we really need to keep the original abfss scheme. I don't recall any situation where we need to use it, perhaps only when creating a table? I wonder if we could simply remove it after parsing the input path.
Originally posted by @Xuanwo in https://github.com/apache/iceberg-rust/issues/1368#issuecomment-2944696445
Right now, it's only used to validate the passed path's scheme against the scheme of the endpoint that might have been used to configure the FileIO.
We don't really distinguish between wasb[s] and abfs[s], only between their TLS and non-TLS versions.
If we say that a user should also be able to use a TLS endpoint via abfss:// even though they might have passed a plain-text http://account.dfs.core.windows.net endpoint to the FileIO, then we can drop it!
I see a little bit of value in forcing all requests to use TLS even if some path may specify the plain text variant. Especially when users use SAS token-based auth.
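For reference, the check described above (comparing only the TLS vs. non-TLS flavour of the path scheme with that of the configured endpoint) could be sketched roughly as follows. This is a minimal illustration and not the actual iceberg-rust code: the helper names `scheme_uses_tls`, `scheme_of`, and `validate_path_against_endpoint` are made up for this example, and the scheme list is an assumption.

```rust
/// Hypothetical helper: does this scheme imply TLS?
/// Returns `None` for schemes this sketch does not know about.
fn scheme_uses_tls(scheme: &str) -> Option<bool> {
    match scheme.to_ascii_lowercase().as_str() {
        // wasb[s] and abfs[s] are treated alike; only TLS vs. plain text matters.
        "abfss" | "wasbs" | "https" => Some(true),
        "abfs" | "wasb" | "http" => Some(false),
        _ => None,
    }
}

/// Extracts the scheme (the text before "://") from a path or endpoint.
fn scheme_of(location: &str) -> Option<&str> {
    location.split_once("://").map(|(scheme, _)| scheme)
}

/// Returns `Ok(())` when the path and the configured endpoint agree on TLS.
fn validate_path_against_endpoint(path: &str, endpoint: &str) -> Result<(), String> {
    let path_scheme =
        scheme_of(path).ok_or_else(|| format!("no scheme in path: {path}"))?;
    let endpoint_scheme =
        scheme_of(endpoint).ok_or_else(|| format!("no scheme in endpoint: {endpoint}"))?;

    let path_tls = scheme_uses_tls(path_scheme)
        .ok_or_else(|| format!("unsupported path scheme: {path_scheme}"))?;
    let endpoint_tls = scheme_uses_tls(endpoint_scheme)
        .ok_or_else(|| format!("unsupported endpoint scheme: {endpoint_scheme}"))?;

    if path_tls == endpoint_tls {
        Ok(())
    } else {
        Err(format!(
            "path scheme {path_scheme} does not match endpoint scheme {endpoint_scheme}"
        ))
    }
}
```

Under this sketch, an abfss:// path against an http:// endpoint is rejected. Dropping the scheme after parsing would amount to removing that final comparison.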
> I see a little bit of value in forcing all requests to use TLS even if some path may specify the plain text variant. Especially when users use SAS token-based auth.
Aha 😆, the biggest blockers come from users of MinIO and Azurite.
> the biggest blocker are from users of minio and azurite
Yeah, as far as I can see, that's the main value of allowing plain text at all. But IIUC neither Azurite nor MinIO supports the ADLS APIs 🤔
Browsing the Azure console, I can only force TLS, but not configure the storage account to only allow plain text. So local testing may be the only reason to support plain text.
I guess it still makes sense to support it in case someone builds their own local emulation...
> One question I have is whether we really need to keep the original abfss scheme
Wouldn't we need to keep this to support existing tables that have files with this scheme?
I think it's valuable to keep it to validate against input, for example when the FileIO was created with hdfs while the user is trying to create an "s3" file with it. It may no longer be a problem once we have finished https://github.com/apache/iceberg-rust/issues/1314
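To make the hdfs vs. "s3" example concrete, here is a rough sketch of that kind of input validation. `SchemeBoundIo` and `check_path` are hypothetical names used only for illustration. They are not part of the iceberg-rust FileIO API, and the real check may look quite different.

```rust
/// Hypothetical stand-in for a FileIO that remembers the scheme it was built for.
struct SchemeBoundIo {
    scheme: String,
}

impl SchemeBoundIo {
    fn new(scheme: &str) -> Self {
        Self {
            scheme: scheme.to_ascii_lowercase(),
        }
    }

    /// Returns the part of the path after "://" if the path's scheme matches
    /// the scheme this IO was created with, and an error otherwise.
    fn check_path<'a>(&self, path: &'a str) -> Result<&'a str, String> {
        let (scheme, rest) = path
            .split_once("://")
            .ok_or_else(|| format!("no scheme in path: {path}"))?;

        if scheme.eq_ignore_ascii_case(&self.scheme) {
            Ok(rest)
        } else {
            Err(format!(
                "FileIO was created for {}, but the path uses {}",
                self.scheme, scheme
            ))
        }
    }
}
```

With `SchemeBoundIo::new("hdfs")`, a path such as `s3://bucket/key` is rejected up front rather than failing later inside the storage backend.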