Support JSON data type

Open tilschuenemann opened this issue 2 years ago • 3 comments

First of all: Thank you for maintaining this project, it really is a nice addition to pandas!

Is your feature request related to a problem? Please describe.

GBQ supports the JSON data type, but unfortunately pandas-gbq doesn't allow writing to (and, I believe, reading from) a table that has a JSON column.

Describe the solution you'd like

  1. Add support for both reading and writing dataframes that feature JSON columns.
  2. Add support for manually specifying JSON columns in the table schema.

Describe alternatives you've considered

Alternatives are:

  • writing the data as a string and then parsing it as JSON inside a SQL query to access its key-value pairs
    • the downside here is that the JSON object can't be used for early query processing, and the conversion adds overhead
  • using the native google.cloud.bigquery module
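
The first alternative (string column plus conversion in SQL) could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names, the `payload` column, and the table name are hypothetical, and the upload helper is not called here because it needs credentials. `pandas_gbq.to_gbq` and BigQuery's `PARSE_JSON` function are the only real APIs used.

```python
import json


def encode_json_column(rows, column):
    """Return copies of `rows` with `column` serialized to a JSON string."""
    return [{**row, column: json.dumps(row[column], sort_keys=True)} for row in rows]


def parse_json_query(table, string_column, json_alias="payload_json"):
    """Build a query that converts the string column back to JSON at read time."""
    return (
        f"SELECT * EXCEPT ({string_column}), "
        f"PARSE_JSON({string_column}) AS {json_alias} "
        f"FROM `{table}`"
    )


def upload_as_strings(rows, column, destination_table, project_id):
    """Hypothetical helper: write the JSON data as a STRING column via pandas-gbq."""
    # Imported locally so the helpers above work without these packages installed.
    import pandas as pd
    import pandas_gbq

    df = pd.DataFrame(encode_json_column(rows, column))
    pandas_gbq.to_gbq(df, destination_table, project_id=project_id, if_exists="append")


rows = [{"id": 1, "payload": {"b": 2, "a": 1}}]
encoded = encode_json_column(rows, "payload")
query = parse_json_query("my-project.my_dataset.events", "payload")
```

The conversion happens on every query, which is the overhead mentioned above.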

Nov 13 '23 10:11 tilschuenemann

Is there any update on this?

Apr 24 '24 04:04 subramad

@chelsea-lin is working on a JSON dtype for https://github.com/googleapis/python-db-dtypes-pandas, and read_gbq support is blocked on that work. Without it, we'd end up returning strings (see: https://github.com/googleapis/python-bigquery/pull/1876#issuecomment-2025562693).

Regarding loads via to_gbq, I believe BigQuery's parquet load jobs don't support JSON columns in the table either, so we'll need to address this as well.
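
Until a JSON dtype exists, a reader who does get strings back could decode them on the client. A minimal sketch, assuming the JSON column arrives as JSON-encoded strings with NULLs as `None`; `decode_json_strings` is a hypothetical helper, not part of pandas-gbq:

```python
import json


def decode_json_strings(values):
    """Parse JSON strings into Python objects, passing None through unchanged."""
    return [None if v is None else json.loads(v) for v in values]


decoded = decode_json_strings(['{"a": 1}', None, "[1, 2]"])
```

With a DataFrame, something like `df["payload"] = decode_json_strings(df["payload"].tolist())` would apply it to a column, again assuming missing values are `None`.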

Jul 18 '24 19:07 tswast