python-bigquery-pandas

feat: [WIP] add dry_run option for read_gbq

Open sycai opened this issue 7 months ago • 1 comments

The basic logic is:

  • If the input is a query, dry run it
  • If the input is a table reference, convert it to a SQL query ("SELECT * FROM ...") and dry run that.
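The two branches above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `is_query`, `to_dry_run_sql`, and `dry_run` are hypothetical, and the whitespace check for telling a query apart from a table reference is an assumed heuristic. The only BigQuery client calls used are `QueryJobConfig(dry_run=True, use_query_cache=False)` and `Client.query`.

```python
import re


def is_query(query_or_table: str) -> bool:
    """Assumed heuristic: SQL text contains whitespace, a table ID does not."""
    return re.search(r"\s", query_or_table.strip()) is not None


def to_dry_run_sql(query_or_table: str) -> str:
    """Return the SQL statement that would be dry-run for the given input."""
    text = query_or_table.strip()
    if is_query(text):
        return text
    # Backticks keep IDs that contain dashes (e.g. project IDs) valid SQL.
    return f"SELECT * FROM `{text}`"


def dry_run(client, query_or_table: str):
    """Submit a dry-run query job and return it; no data is read."""
    # Imported lazily so the helpers above work without google-cloud-bigquery.
    from google.cloud import bigquery

    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    return client.query(to_dry_run_sql(query_or_table), job_config=config)
```

With a configured `bigquery.Client`, `dry_run(client, "my-project.my_dataset.my_table")` would validate the generated `SELECT *` statement and report its statistics without running it.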

The dry run stats are returned in a pandas Series, similar to https://github.com/googleapis/python-bigquery-dataframes/blob/30a62372b2fd72deaf7dbc1dbce8b48f03b3041c/bigframes/core/blocks.py#L818-L885
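As a rough idea of what such a Series could hold, the sketch below collects a few attributes that a dry-run `QueryJob` exposes (`schema`, `total_bytes_processed`, `cache_hit`, `statement_type`). The index labels and the helper names are assumptions for illustration and do not necessarily match this PR or the linked bigframes code. A stand-in object is used in place of a real job so the example runs without credentials.

```python
from types import SimpleNamespace


def collect_dry_run_stats(job) -> dict:
    """Gather dry-run statistics from a QueryJob-like object into a dict."""
    schema = job.schema or []
    return {
        "columnCount": len(schema),
        # BigQuery type names are kept as-is; no pandas dtype conversion here.
        "columnTypes": {field.name: field.field_type for field in schema},
        "totalBytesProcessed": job.total_bytes_processed,
        "cacheHit": job.cache_hit,
        "statementType": job.statement_type,
    }


def stats_to_series(stats: dict):
    """Wrap the stats dict in a pandas Series (pandas is imported lazily)."""
    import pandas as pd

    return pd.Series(list(stats.values()), index=list(stats.keys()), dtype="object")


# Stand-in for a dry-run job; a real one would come from Client.query(...).
fake_job = SimpleNamespace(
    schema=[
        SimpleNamespace(name="id", field_type="INTEGER"),
        SimpleNamespace(name="name", field_type="STRING"),
    ],
    total_bytes_processed=1024,
    cache_hit=False,
    statement_type="SELECT",
)
stats = collect_dry_run_stats(fake_job)
```

Calling `stats_to_series(stats)` then yields an object-dtype Series indexed by the stat names.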

I didn't implement the type conversion from BQ schema to pandas dtypes because it takes non-trivial effort. We could either:

  • Reuse the BQ Python client's conversion function, which is buried deep in the codebase and is probably not meant to be called by external packages.
  • Define our own conversion function, which would need to be kept in sync with the BQ client library.
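For the second option, a hand-written conversion might look something like the sketch below. The mapping is illustrative and deliberately incomplete: the dtype choices are assumptions, not a copy of what the BQ client library does, and every type not listed falls back to `object`. The names `BQ_TO_PANDAS_DTYPE` and `bq_field_to_pandas_dtype` are hypothetical.

```python
# Assumed, partial mapping from BigQuery type names to pandas dtype strings.
BQ_TO_PANDAS_DTYPE = {
    "INTEGER": "Int64",
    "INT64": "Int64",
    "FLOAT": "float64",
    "FLOAT64": "float64",
    "BOOLEAN": "boolean",
    "BOOL": "boolean",
    "STRING": "object",
    "TIMESTAMP": "datetime64[ns, UTC]",
    "DATETIME": "datetime64[ns]",
}


def bq_field_to_pandas_dtype(field_type: str, mode: str = "NULLABLE") -> str:
    """Return a pandas dtype string for one BigQuery schema field."""
    # REPEATED fields are arrays, so each cell holds a Python object.
    if mode.upper() == "REPEATED":
        return "object"
    # Unlisted types (RECORD, GEOGRAPHY, ...) fall back to object.
    return BQ_TO_PANDAS_DTYPE.get(field_type.upper(), "object")
```

The maintenance cost mentioned above shows up here directly: each new BigQuery type, or any change in the client's defaults, would require updating this table by hand.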

sycai Apr 28 '25 22:04

The next step is to add the type conversion logic from BQ schema to pandas dtypes.

sycai Apr 29 '25 21:04

I will close this PR as we are migrating the repo into googleapis/google-cloud-python. Please open a PR there if you want to continue working on it.

Linchin Sep 17 '25 21:09