Write a PostgreSQL query to a parquet file
Describe your changes
We want to allow users to cache queries more permanently, so we're persisting query results into parquet files. In this PR we introduce the functionality for writing parquet files to the Source Manager. We also introduce the concept of batched queries in our connectors. This means connectors can now pull all the data from a query in batches, without exhausting the memory of the machine they run on, which can otherwise happen with huge queries.
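To make the batching idea concrete, here is a minimal sketch, not the code in this PR. The `Cursor` interface, `fakeCursor` and `readInBatches` are hypothetical names used only for illustration; a real connector would read from a database cursor instead of an in-memory array.

```typescript
type Row = Record<string, unknown>

// Hypothetical cursor shape: returns up to `size` rows per call,
// and an empty array once the result set is exhausted.
interface Cursor {
  read(size: number): Promise<Row[]>
}

// In-memory stand-in for a database cursor (illustration only).
function fakeCursor(rows: Row[]): Cursor {
  let offset = 0
  return {
    async read(size: number): Promise<Row[]> {
      const batch = rows.slice(offset, offset + size)
      offset += batch.length
      return batch
    },
  }
}

// Yields one batch at a time, so the consumer never holds more than
// `batchSize` rows from the cursor at once.
async function* readInBatches(
  cursor: Cursor,
  batchSize: number,
): AsyncGenerator<Row[]> {
  while (true) {
    const batch = await cursor.read(batchSize)
    if (batch.length === 0) return
    yield batch
  }
}
```

A consumer would iterate with `for await (const batch of readInBatches(cursor, 1000))` and handle each batch before asking for the next one.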
TODO
- [x] Make sure the connector fails if it has not implemented the `batchQuery` method
- [x] Implement the `batchQuery` method in the postgresql connector.
- [x] Infer query schema. This is needed to build the parquet file.
- [x] Write each batch of rows from the query to a parquet file
- [x] Find a lib that works for writing parquet files 💀. Finally, I picked @dsnp/parquetjs
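
The items above can be sketched roughly as follows. This is an illustration under assumptions, not the implementation in this PR: `BaseConnector`, `RowWriter`, `inferSchema` and `writeQueryToWriter` are hypothetical names, the `batchQuery` signature is invented for the example, and the schema is inferred from the JavaScript values of the first row (a real connector would more likely map the database's column types). `RowWriter` stands in for whatever writer the parquet library provides.

```typescript
type Row = Record<string, unknown>
type ColumnType = 'BOOLEAN' | 'INT64' | 'DOUBLE' | 'UTF8' // illustrative names
type Schema = Record<string, { type: ColumnType; optional: boolean }>
type OnBatch = (rows: Row[]) => Promise<void>

// Hypothetical base class: a connector that does not override
// batchQuery fails loudly instead of silently doing nothing.
class BaseConnector {
  async batchQuery(_sql: string, _batchSize: number, _onBatch: OnBatch): Promise<void> {
    throw new Error(`batchQuery is not implemented in ${this.constructor.name}`)
  }
}

// Assumed minimal writer shape; a parquet writer would sit behind it.
interface RowWriter {
  appendRow(row: Row): Promise<void>
  close(): Promise<void>
}

// Simplification: guess a column type from a single JavaScript value.
// Nulls and unknown values fall back to UTF8.
function inferType(value: unknown): ColumnType {
  if (typeof value === 'boolean') return 'BOOLEAN'
  if (typeof value === 'bigint') return 'INT64'
  if (typeof value === 'number') return 'DOUBLE'
  return 'UTF8'
}

function inferSchema(row: Row): Schema {
  const schema: Schema = {}
  for (const [name, value] of Object.entries(row)) {
    schema[name] = { type: inferType(value), optional: true }
  }
  return schema
}

// Opens the writer lazily from the first non-empty batch, then appends
// every batch as it arrives. Returns the number of rows written.
async function writeQueryToWriter(
  connector: BaseConnector,
  sql: string,
  openWriter: (schema: Schema) => Promise<RowWriter>,
  batchSize = 1000,
): Promise<number> {
  const state: { writer?: RowWriter; rowCount: number } = { rowCount: 0 }
  try {
    await connector.batchQuery(sql, batchSize, async (rows) => {
      if (rows.length === 0) return
      const writer = state.writer ?? (state.writer = await openWriter(inferSchema(rows[0])))
      for (const row of rows) {
        await writer.appendRow(row)
        state.rowCount++
      }
    })
  } finally {
    // Close the file even if the query fails midway.
    if (state.writer) await state.writer.close()
  }
  return state.rowCount
}
```

Opening the writer only after the first batch arrives means the schema can be built from real data, and a query that fails before returning rows leaves no file behind.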