python-bigquery-pandas
Support JSON data type
First of all: Thank you for maintaining this project, it really is a nice addition to pandas!
Is your feature request related to a problem? Please describe.
BigQuery (GBQ) supports the JSON data type, but unfortunately pandas-gbq doesn't allow writing to (and, I believe, reading from) a table that has a JSON column.
Describe the solution you'd like
- Add support for both reading and writing dataframes that feature JSON columns.
- Add support for manually specifying JSON columns in the table schema.
Describe alternatives you've considered
- writing the data as a string and then parsing it as JSON inside a SQL query to access its key-value pairs
  - the downside here is that the JSON object can't be used for early query processing, and the conversion adds overhead
- using the native google.cloud.bigquery module
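For anyone landing here, the first alternative could look roughly like the sketch below. It is only an illustration of the string workaround, not something pandas-gbq provides: the helper names (`encode_json_column`, `build_parse_query`, `upload_with_string_workaround`) and the table, column, and path values are made up for the example. It assumes the destination column is a plain STRING and that the JSON is parsed at query time with BigQuery's `PARSE_JSON` and `JSON_VALUE` functions.

```python
import json


def encode_json_column(values):
    """Serialize Python objects to JSON strings; None stays None (NULL)."""
    return [None if v is None else json.dumps(v, sort_keys=True) for v in values]


def build_parse_query(table, string_column, json_path):
    """Build a query that parses the STRING column as JSON at query time.

    The PARSE_JSON call on every row is the conversion overhead
    mentioned above.
    """
    return (
        f"SELECT JSON_VALUE(PARSE_JSON({string_column}), '{json_path}') AS value "
        f"FROM `{table}`"
    )


def upload_with_string_workaround(records, column, destination_table, project_id):
    """Hypothetical helper: load JSON-like data as a STRING column.

    Needs pandas, pandas-gbq, and BigQuery credentials, so the imports
    are kept local and nothing here runs at import time.
    """
    import pandas as pd
    import pandas_gbq

    df = pd.DataFrame(records)
    df[column] = encode_json_column(df[column])
    pandas_gbq.to_gbq(df, destination_table, project_id=project_id, if_exists="append")


if __name__ == "__main__":
    print(encode_json_column([{"name": "a", "tags": ["x", "y"]}, None]))
    print(build_parse_query("my_dataset.my_table", "payload", "$.name"))
```

The stored column stays a STRING, so every query that touches it pays for the parse again, which is the downside noted in the list.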
Is there any update on this?
@chelsea-lin is working on a JSON dtype for https://github.com/googleapis/python-db-dtypes-pandas which is blocking this for read_gbq support. Otherwise, we'd end up returning strings (see: https://github.com/googleapis/python-bigquery/pull/1876#issuecomment-2025562693).
Regarding loads via to_gbq, I believe BigQuery's parquet load jobs don't support JSON columns in the table either, so we'll need to address this as well.