Tim Sweña (Swast)

Results: 302 comments by Tim Sweña (Swast)

Re: `FAILED tests/system/small/test_pandas.py::test_get_dummies_dataframe[kwargs2] - AssertionError: DataFrame.iloc[:, 11] (column name="time_col.11:14:34.701606") are different`

Looks like the time scalar is losing microsecond precision:

```
COALESCE(`t0`.`time_col` = time(11, 14, 34), FALSE) AS `col_15`,
COALESCE(`t0`.`time_col`...
```
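The failing assertion above can be reproduced without BigQuery at all: constructing a `datetime.time` from only hour, minute, and second silently drops the microsecond component, which is exactly the loss visible in the generated `time(11, 14, 34)` literal. A minimal sketch:

```python
from datetime import time

# Full-precision value from the test data: 11:14:34.701606.
full = time(11, 14, 34, 701606)

# Rebuilding the scalar from (hour, minute, second) only, as the generated
# SQL literal time(11, 14, 34) does, zeroes out the microseconds.
truncated = time(full.hour, full.minute, full.second)

print(full.isoformat())       # 11:14:34.701606
print(truncated.isoformat())  # 11:14:34
print(full == truncated)      # False -> the equality inside COALESCE(..., FALSE) misses rows
```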

Marking as `do not merge` because, to fix Python 3.9 support, we'll need to vendor ibis 9.x into `third_party`.

Sweet. I had forgotten https://github.com/googleapis/python-bigquery/blob/c9068e4191dbe3632fe399a0b777e8bc54a183a6/google/cloud/bigquery/dbapi/_helpers.py#L468-L470 Thanks for investigating! It does seem like we should change this in the BQ Storage client, but this will be good to keep in mind...

This has to be done via the `page_size` parameter on [`QueryJob.result`](https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.job.QueryJob#google_cloud_bigquery_job_QueryJob_result) or [`query_and_wait`](https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.client.Client#google_cloud_bigquery_client_Client_query_and_wait). As far as I can tell we're doing this correctly in Ibis:

https://github.com/ibis-project/ibis/blob/9f565a9ad98a089fcb25959a88136a6e7bc1c506/ibis/backends/bigquery/__init__.py#L802
https://github.com/ibis-project/ibis/blob/9f565a9ad98a089fcb25959a88136a6e7bc1c506/ibis/backends/bigquery/__init__.py#L680

A few things...
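For a rough sense of what the parameter controls: `page_size` caps how many rows come back per API page, so a result set splits into roughly `ceil(total_rows / page_size)` pages. A hedged sketch (`expected_pages` is a hypothetical helper for illustration, not part of the library; the commented calls show where `page_size` is actually passed):

```python
import math

# With google-cloud-bigquery, page size is set at fetch time, e.g.:
#   rows = client.query_and_wait(sql, page_size=10_000)
#   rows = query_job.result(page_size=10_000)
# The helper below only illustrates the resulting page count.

def expected_pages(total_rows: int, page_size: int) -> int:
    """Pages a result of `total_rows` rows splits into at `page_size` rows/page."""
    return math.ceil(total_rows / page_size)

print(expected_pages(25_000, 10_000))  # 3
```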

Would it make sense for Ibis to do some client-side grouping to respect this parameter? I've viewed page size / chunk size as more of a tuning parameter, so it...

Source for the fact that we only need to worry about individual message-to-Arrow/DataFrame conversion:

https://github.com/googleapis/python-bigquery/blob/1246da86b78b03ca1aa2c45ec71649e294cfb2f1/google/cloud/bigquery/_pandas_helpers.py#L598
https://github.com/googleapis/python-bigquery/blob/f8d4aaa335a0eef915e73596fc9b43b11d11be9f/google/cloud/bigquery/_pandas_helpers.py#L752
https://github.com/googleapis/python-bigquery/blob/f8d4aaa335a0eef915e73596fc9b43b11d11be9f/google/cloud/bigquery/_pandas_helpers.py#L581

Turns out there are some use cases for `ReadRowsStream.to_arrow`: https://github.com/vaexio/vaex/blob/f1335d20a6f0a52259c368fc0a7fef3cd4919f8f/packages/vaex-contrib/vaex/contrib/io/gbq.py#L112

Thanks for the report. Yes, we can improve these comments. Note: I did recently change this sample to request a maximum of 1 stream so that the sample does not...