go-sqlcmd
Add ability to import/export data to/from Parquet files with SQLCMD
This is a placeholder for investigating the addition of import/export functionality to go-sqlcmd.
Main use cases would include:
- Output SQL Server query results to Parquet files
- Read data from Parquet files and display some or all rows on screen to check a file's contents
- Read data from Parquet files and insert it into a table in a SQL Server database
- Read both standalone Parquet files and Delta tables
- Perform basic filtering via a SQL-like WHERE clause or a regex applied to a column
Additional possibilities include:
- Output SQL Server query results to JSON, CSV, or XLS files
- Convert JSON, CSV, or XLS to Parquet
- Convert Parquet, including Delta, to JSON, CSV, or XLS
- Convert a subset of rows between formats, e.g. "Take customer data from a Delta store and export all customers in Chicago to XLS"
You can do this easily with sling; see https://github.com/slingdata-io/sling-cli
# set connection via env var
export mssql='sqlserver://...'
# test connection
sling conns test mssql
# run export for many tables
sling run --src-conn mssql --src-stream 'my_schema.*' --tgt-object 'file://{stream_schema}/{stream_table}.parquet'
# run export for one table
sling run --src-conn mssql --src-stream 'my_schema.my_table' --tgt-object 'file://my_folder/my_table.parquet'
# run export for custom sql
sling run --src-conn mssql --src-stream 'select col1, col2 from my_schema.my_table where col3 > 0' --tgt-object 'file://my_folder/my_table.parquet'