python-bigquery-pandas
feat: [WIP] add dry_run option for read_gbq
The basic logic is:
- If the input is a query, dry run it
- If the input is a table reference, convert it to a SQL query ("SELECT * FROM ...") and dry run it.
The dry run stats are returned in a pandas Series, similar to https://github.com/googleapis/python-bigquery-dataframes/blob/30a62372b2fd72deaf7dbc1dbce8b48f03b3041c/bigframes/core/blocks.py#L818-L885
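For illustration, the logic above could look roughly like the sketch below. It is not the code in this PR: the function names `to_dry_run_sql` and `dry_run_stats`, the whitespace heuristic for telling a query from a table reference, and the Series keys are all assumptions made for the example.

```python
import re


def to_dry_run_sql(query_or_table: str) -> str:
    """Return the SQL to dry run (hypothetical helper).

    Assumption: any input containing whitespace is treated as a query;
    anything else is treated as a table reference.
    """
    text = query_or_table.strip()
    if re.search(r"\s", text):
        return text
    return f"SELECT * FROM `{text}`"


def dry_run_stats(client, query_or_table: str):
    """Dry run the input and return a few stats as a pandas Series.

    `client` is expected to be a google.cloud.bigquery.Client.
    The Series keys are illustrative only.
    """
    # Imported lazily so the SQL helper above has no third-party dependencies.
    import pandas as pd
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(to_dry_run_sql(query_or_table), job_config=job_config)
    schema = job.schema or []
    return pd.Series(
        {
            "totalBytesProcessed": job.total_bytes_processed,
            "columnCount": len(schema),
        }
    )
```

A dry run job is not executed, so no rows are read and nothing is billed; only the job's statistics and result schema are populated.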
I didn't implement the type conversion from BQ schema to pandas dtypes because it takes non-trivial effort. We could either:
- Reuse the BQ Python client's conversion function, which is buried deep in its codebase and is possibly not meant to be invoked by external packages.
- Define our own conversion function, which needs to be kept in sync with the BQ client library.
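To give a sense of the second option, here is a minimal sketch of a hand-written conversion. The mapping table and the `schema_to_dtypes` name are hypothetical; the dtypes chosen are one plausible set and are not guaranteed to match what the BQ client library produces, which is exactly the sync burden mentioned above.

```python
# Illustrative mapping only; not taken from the BQ client library.
BQ_TO_PANDAS_DTYPE = {
    "INTEGER": "Int64",
    "INT64": "Int64",
    "FLOAT": "float64",
    "FLOAT64": "float64",
    "BOOLEAN": "boolean",
    "BOOL": "boolean",
    "TIMESTAMP": "datetime64[ns, UTC]",
    "DATETIME": "datetime64[ns]",
}


def schema_to_dtypes(fields):
    """Map (name, field_type, mode) tuples to pandas dtype strings.

    Repeated fields and any type missing from the table fall back to "object".
    """
    dtypes = {}
    for name, field_type, mode in fields:
        if mode.upper() == "REPEATED":
            dtypes[name] = "object"
        else:
            dtypes[name] = BQ_TO_PANDAS_DTYPE.get(field_type.upper(), "object")
    return dtypes
```

With a real dry run result, the tuples could be built from each `SchemaField`'s `name`, `field_type`, and `mode` attributes.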
The next step is to add the type conversion logic from BQ schema to pandas dtypes.
I will close this PR as we are migrating the repo into googleapis/google-cloud-python. Please open a PR there if you want to continue working on it.