read_gbq: add mechanism to ensure BQ storage API usage
Assumption: there is currently no supported way to force read_gbq to use the BQ storage API. I'd be happy to be corrected if I missed something!
Is your feature request related to a problem? Please describe.
I have cases where `read_gbq`'s heuristic chooses the JSON API when I want the storage API. This is most noticeable for me on medium-sized tables, which might take 5-20 seconds to load via the JSON API (comparatively, these were much faster via the storage API). For many of my use cases, I am very willing to pay the additional storage API cost to make interactive work more bearable.
Describe the solution you'd like
A parameter to `read_gbq` that forces the usage of the BQ Storage API (including raising an error if the necessary deps are not available to do so). I won't try to be prescriptive about the details, though I'll note that the desired behavior I've described is what I expected from `use_bqstorage_api`, based on the name. Given the current behavior, `allow_bqstorage_api` is maybe a more accurate name.
I think you are correct. Basically, we are now calling `query_and_wait`, which might not create a destination table that we can read from with the BQ Storage API. Such a flag would have to force the use of `query` from google-cloud-bigquery.
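As a stopgap, one can bypass `read_gbq` and call google-cloud-bigquery directly: `Client.query` (unlike `query_and_wait`) creates a destination table, so the results can be downloaded through the BQ Storage API. A minimal sketch, assuming google-cloud-bigquery and google-cloud-bigquery-storage are installed (the helper name here is illustrative):

```python
# Workaround sketch: call google-cloud-bigquery directly instead of read_gbq.
# Client.query() creates a destination table, so the result rows can be
# fetched via the BQ Storage API rather than the JSON API.
def read_via_bqstorage(sql: str):
    # Imported lazily so the sketch reads without the deps installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job = client.query(sql)  # starts a query job with a destination table
    # create_bqstorage_client=True asks for a BQ Storage read client
    # to be created for the download.
    return job.result().to_dataframe(create_bqstorage_client=True)
```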
It sounds like we need three values for `use_bqstorage_api`:

- `True`: always use (use `query`)
- `False`: never use (currently works, I believe) (use `query_and_wait` and disable BQ Storage client creation)
- `"default"` (or `None`): choose based on the heuristics in `query_and_wait`
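The dispatch for those three values might look roughly like this. This is a sketch only; `choose_query_method` and its string return values are illustrative, not actual pandas-gbq internals:

```python
# Hypothetical sketch of read_gbq dispatching on a three-valued
# use_bqstorage_api flag.
def choose_query_method(use_bqstorage_api):
    if use_bqstorage_api is True:
        # Force query() so a destination table exists for the BQ Storage
        # API; error out later if the storage deps are missing.
        return "query"
    if use_bqstorage_api is False:
        # query_and_wait(), with BQ Storage client creation disabled.
        return "query_and_wait"
    if use_bqstorage_api in (None, "default"):
        # query_and_wait(), letting its own heuristics decide.
        return "query_and_wait"
    raise ValueError(f"invalid use_bqstorage_api: {use_bqstorage_api!r}")
```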