google-cloud-python

deprecate `ReadRowsStream.to_dataframe`, `ReadRowsIterable.to_dataframe`

Open tswast opened this issue 4 years ago • 4 comments

These higher-level methods seemed useful when they were implemented, but they were not needed by google-cloud-bigquery, pandas-gbq, or even the experimental Dask support (https://github.com/dask/dask/issues/3121#issuecomment-578328715). Instead, only the lower-level `ReadRowsPage.to_arrow` and `ReadRowsPage.to_dataframe` methods were used.

Deprecating these would let us fully deprecate passing `read_session` around, because we would no longer have to worry about returning something sensible for empty streams.
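A toy sketch of the empty-stream problem, in plain Python. Everything here (`FakePage`, `concat_pages`, `stream_to_frame`, the dict-of-lists "frame") is a hypothetical stand-in and not the google-cloud-bigquery-storage API. It also assumes the reason an empty stream is awkward is that there is no message to take column names from, so a stream-level conversion needs a schema supplied from elsewhere.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical stand-ins for illustration only; these are NOT the
# google-cloud-bigquery-storage classes.
Frame = Dict[str, List[object]]  # column name -> values, a toy "dataframe"


class FakePage:
    """Toy page: one response message whose rows carry their own column names."""

    def __init__(self, rows: List[dict]):
        self._rows = rows

    def to_frame(self) -> Frame:
        # Column names come from the first row of this page.
        columns = list(self._rows[0]) if self._rows else []
        return {name: [row[name] for row in self._rows] for name in columns}


def concat_pages(pages: Iterable[FakePage]) -> List[Frame]:
    """Page-level conversion: an empty stream simply yields no frames."""
    return [page.to_frame() for page in pages]


def stream_to_frame(
    pages: Iterable[FakePage], schema: Optional[List[str]] = None
) -> Frame:
    """Stream-level conversion: must return a single frame even with no pages."""
    frames = concat_pages(pages)
    if not frames:
        # No message arrived, so column names must come from outside the stream.
        if schema is None:
            raise ValueError("empty stream and no schema to build columns from")
        return {name: [] for name in schema}
    merged: Frame = {}
    for frame in frames:
        for name, values in frame.items():
            merged.setdefault(name, []).extend(values)
    return merged
```

In this sketch, callers that only use the page-level path never need the extra `schema` argument, which is the role `read_session` is assumed to play for the stream-level methods.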

tswast avatar Jul 08 '21 19:07 tswast

Sources showing that we only need to worry about converting an individual message to Arrow or a DataFrame:

https://github.com/googleapis/python-bigquery/blob/1246da86b78b03ca1aa2c45ec71649e294cfb2f1/google/cloud/bigquery/_pandas_helpers.py#L598

https://github.com/googleapis/python-bigquery/blob/f8d4aaa335a0eef915e73596fc9b43b11d11be9f/google/cloud/bigquery/_pandas_helpers.py#L752

https://github.com/googleapis/python-bigquery/blob/f8d4aaa335a0eef915e73596fc9b43b11d11be9f/google/cloud/bigquery/_pandas_helpers.py#L581

tswast avatar Jul 08 '21 19:07 tswast

Turns out there are some use cases for ReadRowsStream.to_arrow: https://github.com/vaexio/vaex/blob/f1335d20a6f0a52259c368fc0a7fef3cd4919f8f/packages/vaex-contrib/vaex/contrib/io/gbq.py#L112

tswast avatar Sep 21 '21 16:09 tswast

googleapis/python-bigquery-storage#247 has a comment about the PR being accidentally closed; I believe we can reuse those changes.

meredithslota avatar Jun 13 '23 16:06 meredithslota

This issue was transferred from python-bigquery-storage to google-cloud-python as part of the work for https://github.com/googleapis/google-cloud-python/issues/10991.

parthea avatar Aug 22 '25 11:08 parthea