cube icon indicating copy to clipboard operation
cube copied to clipboard

Use the Trino client instead of inheriting from Presto

Open ndrluis opened this issue 1 year ago • 4 comments

Hello, I'm opening this PR, but I don't have expertise in TypeScript and JavaScript. I've tried my best to solve the problem, but I need some help to make the test suite work.

This PR aims to decouple the Trino driver from Presto, which uses an outdated client that is quite slow. I ran some tests comparing only the clients, and the Presto driver currently used by Cube (which is not the official one) takes about 3 seconds to execute a SELECT 1 query, while the Trino driver completes it in 300ms.

Additionally, there are features in the Presto driver that are not supported by Trino, such as unloading data through GCS. Therefore, we need to isolate each of these drivers.

Check List

  • [ ] Tests has been run in packages where changes made if available
  • [ ] Linter has been run for changed code
  • [ ] Tests for the changes have been added if not covered yet
  • [ ] Docs have been added / updated if required

ndrluis avatar Nov 12 '24 18:11 ndrluis

The latest updates on your projects. Learn more about Vercel for Git ↗︎

8 Skipped Deployments
Name Status Preview Comments Updated (UTC)
examples-angular-dashboard ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-react-d3 ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-react-dashboard ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-react-data-table ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-react-highcharts ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-react-material-ui ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-react-pivot-table ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm
examples-vue-query-builder ⬜️ Ignored (Inspect) Visit Preview Nov 13, 2024 5:17pm

vercel[bot] avatar Nov 12 '24 18:11 vercel[bot]

Hi @ndrluis 👋 I think this change was long awaited for, thanks for the contribution!

For the docs, if you could update this page with new env vars and check for other instances of "Trino", that would be fantastic: https://cube.dev/docs/reference/configuration/environment-variables

igorlukanin avatar Nov 13 '24 13:11 igorlukanin

@ndrluis Hey Andre! Thanks for contributing! I feel it's right direction however in order to make this switch Trino should meet parity in terms of features. At the first sight at least unload support is missing. We need to have in order to land this change.

paveltiunov avatar Nov 13 '24 17:11 paveltiunov

@paveltiunov I understand the Presto implementation, but as a Trino user, I believe it makes more sense to wait for the spooled client protocol, which solves this problem for any catalog. The current solution only works for GCS and, as far as I know, only for the Hive catalog.

I fixed the query method and removed the incorrect stream method, but I'm unsure of the best way to implement and test it, since the driver works with just the query method. Which cube feature uses the stream method?

ndrluis avatar Nov 13 '24 18:11 ndrluis