python-bigquery-pandas
add `configuration` argument to `to_gbq`
The to_gbq function should take a configuration argument representing a BigQuery JobConfiguration REST API resource.
This would make it consistent with the read_gbq function.
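As a rough sketch of the symmetry being requested: `read_gbq` already accepts a `configuration` dict keyed by job type (`"query"`), so the proposed `to_gbq` argument would presumably be keyed by `"load"`. The `to_gbq` call shown in the comments is the proposal, not an existing signature, and the project, dataset, and table names are placeholders.

```python
# Shape accepted by the existing `configuration` argument of read_gbq:
# the top-level key names the job type of the JobConfiguration resource.
query_configuration = {"query": {"useQueryCache": False}}

# Assumed shape for the proposed to_gbq argument: the same resource,
# keyed by "load" because to_gbq runs a load job.
load_configuration = {"load": {"writeDisposition": "WRITE_APPEND"}}

# Both calls need credentials and a real project, so they are left commented out.
# Existing:
#   pandas_gbq.read_gbq("SELECT 1", project_id="my-project",
#                       configuration=query_configuration)
# Proposed in this issue (hypothetical argument):
#   pandas_gbq.to_gbq(df, "my_dataset.my_table", project_id="my-project",
#                     configuration=load_configuration)
```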
Context
Options for table creation / schema updates
- Partitioning and Clustering: https://github.com/googleapis/python-bigquery-pandas/issues/395
- Schema update options: https://github.com/googleapis/python-bigquery-pandas/issues/107
- Partition expiration time: https://github.com/googleapis/python-bigquery-pandas/issues/313
I believe these would require table creation to be done by the load job rather than in a separate create-table step (especially partitioning, which must be configured at creation time). It is TBD what this would look like if we add support for the BigQuery Storage Write API or the (legacy) Streaming API.
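For illustration, the three options above might map onto a load job configuration roughly as follows. The field names (`timePartitioning`, `clustering`, `schemaUpdateOptions`, `createDisposition`) come from the `JobConfigurationLoad` REST resource; the helper `build_load_configuration` and the column names are hypothetical and not part of pandas-gbq.

```python
def build_load_configuration(partition_field, clustering_fields,
                             partition_expiration_ms=None):
    """Hypothetical helper: build a load configuration dict that lets the
    load job create the table, so partitioning is set at creation time."""
    time_partitioning = {"type": "DAY", "field": partition_field}
    if partition_expiration_ms is not None:
        # The REST API represents int64 values as strings.
        time_partitioning["expirationMs"] = str(partition_expiration_ms)
    return {
        "load": {
            # Let the load job create the table if it does not exist.
            "createDisposition": "CREATE_IF_NEEDED",
            "writeDisposition": "WRITE_APPEND",
            "timePartitioning": time_partitioning,
            "clustering": {"fields": list(clustering_fields)},
            "schemaUpdateOptions": ["ALLOW_FIELD_ADDITION"],
        }
    }


# Placeholder column names; 30 days expressed in milliseconds.
configuration = build_load_configuration(
    partition_field="event_date",
    clustering_fields=["customer_id"],
    partition_expiration_ms=30 * 24 * 60 * 60 * 1000,
)
```

Passing a dict like this through to the load job would avoid adding a separate `to_gbq` keyword for each option.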
Options for file loading
- Custom NULL marker: https://github.com/googleapis/python-bigquery-pandas/issues/366 (I believe this would also require an update to the pandas CSV write configuration, though)
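A minimal sketch of why both sides need to change: the marker that pandas writes for missing values (`na_rep` in `DataFrame.to_csv`) has to match the `nullMarker` given to the load job. How pandas-gbq would wire the two together is an assumption here; the marker value and column names are placeholders.

```python
import io

import pandas as pd

NULL_MARKER = r"\N"  # placeholder marker; any string agreed on by both sides

df = pd.DataFrame({"name": ["a", None], "score": [1.5, None]})

# pandas side: write missing values using the chosen marker.
buffer = io.StringIO()
df.to_csv(buffer, index=False, header=False, na_rep=NULL_MARKER)
csv_text = buffer.getvalue()

# BigQuery side: the load job must be told to read the same marker as NULL.
load_configuration = {"load": {"sourceFormat": "CSV", "nullMarker": NULL_MARKER}}
```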