ci(bigquery): bigquery ci is very slow

Open cpcloud opened this issue 1 year ago • 3 comments

This CI run took nearly an hour and a half: https://github.com/ibis-project/ibis/actions/runs/8697121662/job/23851753722.

Is there something we can do to speed this up a bit?

cc @tswast

cpcloud avatar Apr 17 '24 11:04 cpcloud

Created a notebook in a gist showing the issue: https://gist.github.com/cpcloud/b019ed898312d422190152b02b029377.

cpcloud avatar Apr 17 '24 13:04 cpcloud

Not sure why the increase in variability. One thing we might want to try is the new query_and_wait method (added in google-cloud-bigquery 3.14.0, https://github.com/googleapis/python-bigquery/pull/1722) which is optimized for queries that return small (< 100 MB) results.

tswast avatar Apr 25 '24 20:04 tswast

> optimized for queries that return small (< 100 MB) results.

Note: It falls back to the existing BQ Storage Read API implementation for larger results.

tswast avatar Apr 25 '24 20:04 tswast

Copying from an email here for easier reference.

I added Client.query_and_wait in google-cloud-bigquery 3.14.0 late last year (https://github.com/googleapis/python-bigquery/blob/main/CHANGELOG.md#3140-2023-12-08). Since then there have been a few fixes and optimizations, but I think that should be safe as the minimum version for what ibis is doing. For small queries with small results (< 500 KB or so) that can save anywhere from a few hundred milliseconds to 3 seconds.
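For illustration only, a minimal sketch of gating on that minimum version. The helper name `supports_query_and_wait` and the `MIN_VERSION` constant are hypothetical and not part of ibis or google-cloud-bigquery; the only fact taken from the thread is that 3.14.0 is the first release with `query_and_wait`.

```python
import re

# First google-cloud-bigquery release that ships Client.query_and_wait,
# per the changelog entry linked above.
MIN_VERSION = (3, 14, 0)


def supports_query_and_wait(version: str) -> bool:
    """Return True if a version string is at least 3.14.0.

    Hypothetical helper: only the leading numeric components are compared,
    so a pre-release suffix such as "rc1" is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        return False
    parts = tuple(int(part or 0) for part in match.groups())
    return parts >= MIN_VERSION
```

The installed version string could be read with `importlib.metadata.version("google-cloud-bigquery")` and passed to this helper.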

Looking at where .query() is currently called in ibis

https://github.com/ibis-project/ibis/blob/560ddf6ca24e0d29fdd565aa59a22f3e7a32e959/ibis/backends/bigquery/__init__.py#L628-L631

it won't be quite trivial. For example, the pattern of waiting until .result() to set the page size

https://github.com/ibis-project/ibis/blob/560ddf6ca24e0d29fdd565aa59a22f3e7a32e959/ibis/backends/bigquery/__init__.py#L810

won't work. It needs to be set at query_and_wait time.
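A rough sketch of the difference, not the actual ibis change. The function `run_query` is a hypothetical name, and the client is passed in rather than constructed so the snippet does not assume any credentials or project setup. It relies only on `Client.query_and_wait(..., page_size=...)`, `Client.query(...)`, and `QueryJob.result(page_size=...)` from google-cloud-bigquery.

```python
def run_query(client, sql, page_size=None):
    """Run a query, preferring query_and_wait when the client provides it.

    Hypothetical helper: `client` is expected to behave like
    google.cloud.bigquery.Client.
    """
    if hasattr(client, "query_and_wait"):
        # There is no separate .result() step on this path, so the page
        # size has to be supplied at call time.
        return client.query_and_wait(sql, page_size=page_size)

    # Older pattern: start the job first, then choose the page size when
    # fetching results.
    job = client.query(sql)
    return job.result(page_size=page_size)
```

Both branches return a row iterator, so callers that only iterate over rows should not need to care which path was taken. Code that depends on the job object itself would need more thought.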

tswast avatar Jun 20 '24 17:06 tswast

Yep, working through it now!

cpcloud avatar Jun 20 '24 17:06 cpcloud

Thanks for the reference!

cpcloud avatar Jun 20 '24 17:06 cpcloud