dask-sql
[ENH] Add support for writing query results to disk
I'd like to do some data manipulation and persist results to storage.
In Hive and Spark, you can use the CREATE EXTERNAL TABLE ...
syntax which allows specifying, for example, the storage location and file format.
Example:
CREATE EXTERNAL TABLE IF NOT EXISTS new_table_on_disk
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/raid/new_table_on_disk/' AS
SELECT * FROM df_final_View
The above is a somewhat verbose way of specifying CSV arguments, but we could simplify it by reusing the
keyword = value style that our input tables already use:
CREATE EXTERNAL TABLE my_data WITH (
format = 'parquet',
location = 'raid/new_table_on_disk'
)
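
To make the proposal a bit more concrete, here is a minimal sketch of how the keyword = value options in the WITH clause could be turned into a plain dict. The helper name parse_with_options and the regex-based approach are assumptions for illustration only; this is not how dask-sql parses SQL, and it does not handle values that contain a closing parenthesis or a quote.

```python
import re
from typing import Dict

# Hypothetical helper, not part of dask-sql: pulls key = 'value' pairs
# out of the WITH (...) clause of the proposed statement.
WITH_CLAUSE = re.compile(r"WITH\s*\((.*?)\)", re.IGNORECASE | re.DOTALL)
OPTION = re.compile(r"(\w+)\s*=\s*'([^']*)'")


def parse_with_options(statement: str) -> Dict[str, str]:
    """Return the options in the WITH (...) clause as a dict."""
    match = WITH_CLAUSE.search(statement)
    if match is None:
        # No WITH clause, so there are no options to report.
        return {}
    # Lowercase the keys so FORMAT and format are treated the same.
    return {key.lower(): value for key, value in OPTION.findall(match.group(1))}


statement = """
CREATE EXTERNAL TABLE my_data WITH (
    format = 'parquet',
    location = 'raid/new_table_on_disk'
)
"""
options = parse_with_options(statement)
print(options)
```

The resulting dict could then select a writer, for example Dask's DataFrame.to_parquet or DataFrame.to_csv, with location as the target path.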