go-sqlcmd icon indicating copy to clipboard operation
go-sqlcmd copied to clipboard

Add ability to import/export data to/from Parquet files with SQLCMD

Open dlevy-msft opened this issue 1 year ago • 1 comments

Creating this placeholder for the investigation of adding import/export functionality to go-sqlcmd.

Main use cases would include:

  1. Output SQL Server query results to parquet files
  2. Read data from parquet files for display of some or all rows on screen to check contents of file
  3. Read data from parquet files and insert into table in SQL Server database
  4. Be able to read both standalone parquet files as well as delta
  5. Perform basic filtering via SQL-like where clause or regex by column

Additional possibilities include:

  1. Output SQL Server query results to json, csv, xls files
  2. Convert json, csv, xls to parquet
  3. Convert parquet, including delta, to json, csv, xls
  4. Convert subset of rows between formats - ex: "Take customer data from delta store and export all customers in Chicago to xls"

dlevy-msft avatar Dec 07 '23 15:12 dlevy-msft

You can do this easily with sling, check out: https://github.com/slingdata-io/sling-cli

# set connection via env var
export mssql='sqlserver://...'

# test connection
sling conns test mssql

# run export for many tables
sling run --src-conn mssql --src-stream 'my_schema.*' --tgt-object 'file://{stream_schema}/{stream_table}.parquet'

# run export for one table
sling run --src-conn mssql --src-stream 'my_schema.my_table' --tgt-object 'file://my_folder/my_table.parquet'

# run export for custom sql
sling run --src-conn mssql --src-stream 'select col1, col2 from my_schema.my_table where col3 > 0' --tgt-object 'file://my_folder/my_table.parquet'

flarco avatar Jul 02 '24 10:07 flarco