Delta sharing should have a proper URI
The current Delta Sharing share file is an awkward and unfamiliar way to precisely address a remote data table. The share file creates friction as we try to make Delta Sharing more universal. Compare the familiar http:, sftp: and s3: URI schemes.
Let us develop a standardized URI for Delta Sharing. The standardized URI can then be dropped into common communication channels (email, messaging, web pages) and indexed by search engines and knowledge graphs.
Current problems:
- The share file carries a secret in clear text, which is a common security issue; the file should be encrypted upon saving.
- The application programmer has to first store and distribute the file; the current APIs don't make it easy to store the share file in a secret store. The typical example starts with
    client = delta_sharing.SharingClient("file:///path/to/my/share/file")
  After discovering the table name, the full URI looks like:
    file:///path/to/my/share/file#<share-name>.<schema-name>.<table-name>
  Of course, for each user, the URI isn't really universal because the file is local.
- The unique shared table identifier can't be easily transmitted in a standard form to collaborators.
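To make the problems above concrete, here is a minimal sketch of the current workflow using only the Python standard library. It writes a share (profile) file with the three fields the profile format uses (`shareCredentialsVersion`, `endpoint`, `bearerToken`) and builds the `<profile-path>#<share>.<schema>.<table>` address. The helper names `write_profile` and `table_url`, the endpoint, and the token value are made up for illustration and are not part of the `delta_sharing` package.

```python
import json
import tempfile
from pathlib import Path


def write_profile(path, endpoint, bearer_token):
    """Write a share (profile) file. Note the token lands on disk in clear text."""
    profile = {
        "shareCredentialsVersion": 1,
        "endpoint": endpoint,
        "bearerToken": bearer_token,
    }
    Path(path).write_text(json.dumps(profile, indent=2))
    return profile


def table_url(profile_path, share, schema, table):
    """Build the current-style table address: <profile-path>#<share>.<schema>.<table>."""
    return f"{profile_path}#{share}.{schema}.{table}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        profile_path = Path(tmp) / "example.share"
        # Hypothetical endpoint and token, for illustration only.
        write_profile(
            profile_path,
            "https://sharing.example.com/delta-sharing/",
            "placeholder-token",
        )
        # Anyone who can read the file can read the bearer token.
        print(profile_path.read_text())
        # The address embeds a local path, so it differs on every machine.
        print(table_url(profile_path, "myshare", "myschema", "mytable"))
```

Two collaborators who save the same share file in different directories end up with different table addresses, which is exactly why the identifier can't be passed around as-is.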
Instead, offer up a standardized URI interface to address delta sharing resources, something like:
delta://token:<bearertokenvalue>@<endpoint host>:port/path/sharename/schema/table[/part_key=part_value]
delta://<username>:<tokenvalue>@<endpoint host>:port/path/sharename/schema/table
delta://<endpoint host>:port/<pathinfo>/<sharename>/<schema>/<table>
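As a rough sketch of how a client might consume such a URI, the snippet below parses the proposed `delta://` forms with Python's standard `urllib.parse`. Nothing here exists in Delta Sharing today: the `delta` scheme is only the proposal above, and `DeltaTableRef`, `parse_delta_uri`, and `to_profile` are hypothetical names. It also assumes the endpoint is served over HTTPS, that the last three path segments are share/schema/table, and that any trailing `key=value` segments are partition filters.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import unquote, urlsplit


@dataclass
class DeltaTableRef:
    endpoint: str                 # e.g. https://host:port/path
    bearer_token: Optional[str]   # None for public, unauthenticated shares
    share: str
    schema: str
    table: str
    partitions: Dict[str, str] = field(default_factory=dict)

    def to_profile(self) -> dict:
        """Render the fields a share (profile) file carries today."""
        return {
            "shareCredentialsVersion": 1,
            "endpoint": self.endpoint,
            "bearerToken": self.bearer_token,
        }


def parse_delta_uri(uri: str) -> DeltaTableRef:
    parts = urlsplit(uri)
    if parts.scheme != "delta":
        raise ValueError(f"not a delta:// URI: {uri!r}")
    if not parts.hostname:
        raise ValueError("missing endpoint host")

    segments = [unquote(s) for s in parts.path.split("/") if s]

    # Assumption: trailing key=value segments are partition filters.
    partitions: Dict[str, str] = {}
    while segments and "=" in segments[-1]:
        key, value = segments.pop().split("=", 1)
        partitions[key] = value

    if len(segments) < 3:
        raise ValueError("expected .../<share>/<schema>/<table>")
    *prefix, share, schema, table = segments

    # Assumption: the sharing server is reached over HTTPS.
    host = parts.hostname if parts.port is None else f"{parts.hostname}:{parts.port}"
    endpoint = f"https://{host}"
    if prefix:
        endpoint += "/" + "/".join(prefix)

    token = unquote(parts.password) if parts.password else None
    return DeltaTableRef(endpoint, token, share, schema, table, partitions)


if __name__ == "__main__":
    # Hypothetical host, token, and table names.
    ref = parse_delta_uri(
        "delta://token:placeholder@sharing.example.com:443/delta-sharing/myshare/default/covid/year=2020"
    )
    print(ref)
    print(ref.to_profile())
```

One design caveat with this layout: because the server path prefix has variable length, the parser has to count segments from the right, so share, schema, and table names could not themselves contain an unescaped `/` or `=`.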
For public data sets, one could then drop URIs into web pages:
<a href="delta://<endpoint host>:port/path/sharename/schema/table">covid data 2020</a>
The vision is to integrate the Delta Sharing protocol into a broad range of existing clients by leveraging their current URI handling subroutines. Sharing terabytes of data should be as easy as copy-pasting a link into Slack.
Thanks @dmoore247, the idea is interesting.
A couple of questions; not sure if you've already thought about them:
- Do we need to host a server for the common URI?
- I assume yes? If so, does the server handle the token authentication?
- And does the server identify where the provider table is located?