Use BigQuery storage read API when reading external BigLake tables

Open anoopj opened this issue 11 months ago • 11 comments

Description

The BigQuery storage APIs support reading BigLake external tables (i.e., external tables with a connection). However, the current implementation reads them through views, which can be expensive because it requires Trino to issue a SQL query against BigQuery. This PR adds support for reading BigLake tables directly using the storage API.

There are no behavior changes for other external tables or BQ native tables: they continue to use the view and storage APIs respectively. Added a new test for BigLake tables.
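As a rough illustration of the read-path selection described above, here is a minimal sketch. It is not the connector's actual code: the class, enum, and method names are hypothetical, and it assumes a BigLake table can be recognized as an external table that carries a connection ID.

```java
import java.util.Optional;

public class ReadPathSketch {
    // Hypothetical table kinds, named for illustration only.
    enum TableKind { NATIVE, EXTERNAL }

    // The two read paths mentioned in the description.
    enum ReadPath { STORAGE_READ_API, QUERY_VIA_VIEW }

    // Assumption: connectionId is present only for BigLake tables
    // (external tables with a connection).
    static ReadPath choose(TableKind kind, Optional<String> connectionId) {
        if (kind == TableKind.NATIVE) {
            // Native tables already use the storage API (unchanged).
            return ReadPath.STORAGE_READ_API;
        }
        // External table: with this change, BigLake tables go straight to the
        // storage API; other external tables keep using the view path.
        return connectionId.isPresent()
                ? ReadPath.STORAGE_READ_API
                : ReadPath.QUERY_VIA_VIEW;
    }

    public static void main(String[] args) {
        System.out.println(choose(TableKind.NATIVE, Optional.empty()));
        System.out.println(choose(TableKind.EXTERNAL, Optional.empty()));
        System.out.println(choose(TableKind.EXTERNAL, Optional.of("example-connection")));
    }
}
```

The only case whose result changes under this sketch is the last one, an external table with a connection, which matches the "no behavior changes" statement for the other two table kinds.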

Additional context and related issues

Fixes https://github.com/trinodb/trino/issues/21016

https://cloud.google.com/bigquery/docs/biglake-intro

Release notes

(x) Release notes are required, with the following suggested text:

# BigQuery
* Improve performance when reading external BigLake tables. ({issue}`21016`)

anoopj avatar Mar 11 '24 21:03 anoopj

/test-with-secrets sha=9d7bd1dcad92de70856b928a99e443ec3d8b4619

hashhar avatar Mar 13 '24 08:03 hashhar

The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/8261925064

github-actions[bot] avatar Mar 13 '24 09:03 github-actions[bot]

Support for reading BigLake tables using BigQuery storage read API.

Please remove the trailing dot. https://github.com/trinodb/trino/blob/master/.github/DEVELOPMENT.md#format-git-commit-messages

Also, I would change the title to "Use BigQuery storage read API when reading external BigLake tables" because the current title looks a little misleading. Reading BigLake tables has already been supported via the query API.

ebyhr avatar Mar 14 '24 01:03 ebyhr

Done.

anoopj avatar Mar 14 '24 03:03 anoopj

@ebyhr Do you have any more feedback or can this be merged?

anoopj avatar Mar 19 '24 17:03 anoopj

@ebyhr @hashhar Friendly ping here. We have a GCP customer who is waiting for this PR to be merged.

anoopj avatar Mar 22 '24 17:03 anoopj

See https://github.com/trinodb/trino/pull/21017#discussion_r1536270950. I think it's an important question.

hashhar avatar Mar 26 '24 09:03 hashhar

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar May 23 '24 17:05 github-actions[bot]

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

github-actions[bot] avatar Jun 13 '24 17:06 github-actions[bot]

Can we get this merged?

velascoluis avatar Jun 26 '24 09:06 velascoluis

@anoopj Do you plan to continue this? Or should someone else pick this up and drive to completion? I see that the newer client is released already.

hashhar avatar Jun 26 '24 10:06 hashhar

@hashhar I am not planning to work on this.

anoopj avatar Jul 03 '24 01:07 anoopj

@ssheikin and @hashhar .. are you taking this over here or in a new PR? Should we close this one?

mosabua avatar Jul 04 '24 18:07 mosabua

Please leave it open for now. We are discussing.

k-haley1 avatar Jul 11 '24 17:07 k-haley1

@mosabua I could take up the work on this PR

Praveen2112 avatar Jul 16 '24 16:07 Praveen2112

Sounds good @Praveen2112. Since @anoopj is not going to continue, you can continue on this PR or start a new one with his work. Just link to this PR if you create a new one.

mosabua avatar Jul 16 '24 16:07 mosabua

Continuation of this PR - https://github.com/trinodb/trino/pull/22974

marcinsbd avatar Sep 10 '24 11:09 marcinsbd