hyper-api-samples

add support for CREATE EXTERNAL TABLE

Open djouallah opened this issue 1 year ago • 4 comments

Currently only temporary external tables are supported. It would be nice to remove that limitation, please.

djouallah avatar Jul 04 '23 04:07 djouallah

From an implementation side, what you are asking for is straightforward. In fact, it was already discussed a couple of times internally. Unfortunately, there is more than just implementation to this feature. In particular, usability and security pose challenges.

The main issues currently are:

  1. If you move a .hyper file around, how does Hyper locate the external files? Through paths relative to the Hyper file? Absolute file paths?
  2. Should there be a way to package external files together with the .hyper file? E.g., when uploading it to Tableau Cloud.
  3. If you send a Hyper file via email, and some other person opens it, should Hyper read whichever external files are specified in the .hyper file? What if someone maliciously added an external table which reads /etc/passwd as an external CSV or some other sensitive data?
  4. What if you upload a Hyper file to Tableau Cloud? Should that file be allowed to instruct Hyper to read /etc/passwd and display it as part of some visualization?

While the answer for /etc/passwd is clearly a "no, this should not be allowed", it's hard to draw the line here.
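To make question 1 concrete, here is a minimal sketch of the two lookup strategies. It assumes POSIX-style paths and is purely lexical; `resolve_external` is a hypothetical helper for illustration, not part of Hyper or the Hyper API.

```python
from pathlib import PurePosixPath


def resolve_external(hyper_file: str, stored_path: str) -> PurePosixPath:
    """Resolve an external file path stored inside a .hyper file.

    Hypothetical helper: absolute paths are returned unchanged, relative
    paths are interpreted relative to the directory of the .hyper file.
    """
    stored = PurePosixPath(stored_path)
    if stored.is_absolute():
        return stored
    return PurePosixPath(hyper_file).parent / stored


# A relative path follows the .hyper file when it is moved ...
print(resolve_external("/data/a/sales.hyper", "parquet/2023.parquet"))
print(resolve_external("/backup/sales.hyper", "parquet/2023.parquet"))
# ... while an absolute path keeps pointing at the original location.
print(resolve_external("/backup/sales.hyper", "/lake/2023.parquet"))
```

With relative paths, the external files have to travel together with the .hyper file. With absolute paths, the .hyper file can move freely, but it only works on machines where the same absolute location exists.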

vogelsgesang avatar Jul 04 '23 07:07 vogelsgesang

  1. Absolute paths.
  2. No, that defeats the purpose of an external table. The data has to be in shared storage.
  3. It's not Hyper's fault if someone stores sensitive data without encryption. Moreover, only the user can see it anyway. But I am not a security expert.
  4. An option in Tableau Cloud to block reading from internal data.

My use case is reading Parquet files from remote storage, which I think is a very common pattern these days with lakehouses and such :)

Thanks a lot for your reply.

djouallah avatar Jul 04 '23 08:07 djouallah

Agree with @djouallah, permanent external tables (and views) are missing in Hyper. To answer your questions:

  1. Both (relative and absolute).
  2. External means external, so no packing of external data into the .hyper file.
  3. Limit external file extensions to csv and parquet.
  4. Limiting external file extensions should handle this problem as well. Also limit the number of files in globs to 1000. There could also be limits on directories (no /etc, no /usr, no /opt, no c:\windows, no c:\program files, ...).
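The limits suggested above could be sketched roughly as follows. This is an illustration only: the names `is_allowed` and `validate_glob` are hypothetical, only POSIX-style directories are covered, and the check is purely lexical, so it does not detect symlinks that point into a denied directory.

```python
from pathlib import PurePosixPath
from typing import List, Sequence

# Values taken from the suggestion above; everything else is an assumption.
ALLOWED_EXTENSIONS = {".csv", ".parquet"}
DENIED_DIRECTORIES = [PurePosixPath(d) for d in ("/etc", "/usr", "/opt")]
MAX_GLOB_MATCHES = 1000


def is_allowed(path: str) -> bool:
    """Return True if a single external file path passes the policy."""
    p = PurePosixPath(path)
    # Reject ".." so a path cannot lexically escape into a denied directory.
    if ".." in p.parts:
        return False
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        return False
    # Denied if the path is a denied directory or lies anywhere below one.
    return not any(d == p or d in p.parents for d in DENIED_DIRECTORIES)


def validate_glob(matches: Sequence[str]) -> List[str]:
    """Check the already-expanded result of a glob against the policy."""
    if len(matches) > MAX_GLOB_MATCHES:
        raise ValueError(f"glob matched more than {MAX_GLOB_MATCHES} files")
    rejected = [m for m in matches if not is_allowed(m)]
    if rejected:
        raise ValueError(f"paths not allowed: {rejected}")
    return list(matches)


print(is_allowed("/lake/sales/2023.parquet"))
print(is_allowed("/etc/passwd"))
print(is_allowed("/etc/secrets.csv"))
```

Note that /etc/passwd is already rejected by the extension rule alone, while a file such as /etc/secrets.csv is only caught by the directory rule, which is why the two limits are combined here.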

Couldn't these security concerns also be blocked by a security tool (EDR or antivirus) on the Tableau Cloud clusters?

Object storage is great, but there are also fast parallel remote filesystems like pNFS or Lustre that provide excellent performance for accessing remote data.

rferraton avatar Jul 04 '23 22:07 rferraton

Recently DuckDB added an option to turn off reading from the local filesystem. I guess you could do the same for Tableau Cloud and turn it off by default for security reasons.

djouallah avatar Jul 20 '23 02:07 djouallah