feat(flightsql-jdbc): Add support for Arrow Flight JDBC driver (#8829)

Open vincentditlevinz opened this issue 7 months ago • 8 comments

Check List

  • [x] Tests have been run in packages where changes made if available
  • [x] Linter has been run for changed code
  • [x] Tests for the changes have been added if not covered yet
  • [x] Docs have been added / updated if required

Issue Reference this PR resolves

[#8829]

vincentditlevinz avatar Jul 17 '25 10:07 vincentditlevinz

Sorry for the level of testing. As I said, I followed packages/cubejs-databricks-jdbc-driver as a template and did not find many tests in it, and the same goes for the jdbc drivers module. I might try something with testcontainer...

That said, I successfully built a dev docker image and connected to a spice.ai SQL engine that supports the Arrow Flight SQL JDBC driver, using CUBEJS_DB_TYPE=flightsql-jdbc.

The driver download works, as shown in this screenshot of the dev container's internal file system:

image

However, I cannot do more than connect to the database when clicking on the "Data model" folder.

image

I might have missed something, or perhaps a dev container is not fully featured? When I try to see the generated SQL in the playground, I see Error: Unknown dbType: flightsql-jdbc, and in the container's logs:

image
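
For readers following along, an error of this shape is consistent with a type lookup that has no entry for the new driver. Below is a minimal sketch of that failure mode. The registry contents and the function name `resolveDriverPackage` are assumptions made for illustration, not Cube's actual source; the real lookup may live in several places and behave differently.

```typescript
// Hypothetical registry mapping a CUBEJS_DB_TYPE value to a driver package.
// Entries and names are illustrative assumptions, not Cube's real code.
const knownDrivers: Record<string, string> = {
  postgres: "@cubejs-backend/postgres-driver",
  "databricks-jdbc": "@cubejs-backend/databricks-jdbc-driver",
};

// Returns the package for a known type, or fails the way the playground did.
function resolveDriverPackage(dbType: string): string {
  const pkg = knownDrivers[dbType];
  if (pkg === undefined) {
    throw new Error(`Unknown dbType: ${dbType}`);
  }
  return pkg;
}
```

If something like this is the cause, connecting can still succeed through one code path while SQL generation fails in another that has not been told about the new type.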

vincentditlevinz avatar Jul 17 '25 11:07 vincentditlevinz

Hi @vincentditlevinz 👋 Thanks! I'd suggest that you use databricks-jdbc as an example and check if there's anything else that needs to be updated: https://github.com/search?q=repo%3Acube-js%2Fcube%20databricks-jdbc&type=code

igorlukanin avatar Jul 18 '25 10:07 igorlukanin

Hi @igorlukanin

I managed to add some tests with testcontainer that worked on my laptop :smile: They test testConnection() and a simple query (count(*)) against an Arrow Flight SQL-compatible database.
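
The shape of those two checks can be sketched roughly as follows. This is not the test code from the PR: the `SqlDriver` interface, the `FakeDriver` stand-in (used here instead of a real container) and the `smokeTest` helper are all hypothetical names introduced only to show the flow.

```typescript
// Minimal shape of the two driver methods exercised by the tests.
// This interface is an assumption for illustration, not the real driver class.
interface SqlDriver {
  testConnection(): Promise<void>;
  query(sql: string, values: unknown[]): Promise<Record<string, unknown>[]>;
}

// Stand-in for a driver connected to a containerized Arrow Flight SQL engine.
class FakeDriver implements SqlDriver {
  private readonly rows: number;

  constructor(rows: number) {
    this.rows = rows;
  }

  async testConnection(): Promise<void> {
    // A real driver would open and validate a JDBC connection here.
  }

  async query(sql: string, _values: unknown[]): Promise<Record<string, unknown>[]> {
    // Only the count(*) query is understood by this fake.
    if (/count\(\*\)/i.test(sql)) {
      return [{ cnt: this.rows }];
    }
    throw new Error(`Unsupported query in fake driver: ${sql}`);
  }
}

// Smoke test: a connection check followed by a simple count(*) query.
async function smokeTest(driver: SqlDriver, table: string): Promise<number> {
  await driver.testConnection();
  const result = await driver.query(`SELECT count(*) AS cnt FROM ${table}`, []);
  return Number(result[0]?.cnt);
}
```

In the real tests, the fake would be replaced by the actual driver pointed at a database started by testcontainer, and the table name would be whatever the fixture creates.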

When I compared all occurrences of databricks-jdbc with my PR, I found that I haven't yet:

  • written any doc.
  • written anything in cubejs-testing/birdbox-fixture, not sure I should.
  • written anything in cubejs-testing-drivers. It sounds a bit complex to me to understand what I should code there.
  • added a line in cubejs-docker/package.json, because it won't work until the flightsql-jdbc driver is first published to the NPM repository (the same goes for the CLI, which won't work until then either). As far as I understand, a new driver can only be included in a "latest" docker image once it has already been published to the NPM repository.

The last item probably explains why my new driver doesn't work well in the playground, because I must use a "dev" docker image for the moment.

Please let me know what you expect me to do to make this PR valid according to your development standards.

vincentditlevinz avatar Jul 18 '25 12:07 vincentditlevinz

What bothers me here is that different databases require different SQL dialects. As you can see in the code base, many databases require not only a driver but also a SQL dialect implementation. I wonder how this is going to be solved with this one. I'll kindly ask @ovr or @KSDaemon to provide their ideas here.

igorlukanin avatar Jul 29 '25 11:07 igorlukanin

Hi @igorlukanin ,

Thanks for your interest in this PR.

I don't know exactly. According to the spice.ai FAQ (spice.ai is the database I used for testing), it is based on Apache DataFusion and therefore supports the Postgres dialect. Dremio supports the ANSI SQL standard. Apache Doris also supports the Postgres dialect, among others. Finally, according to InfluxDB: "InfluxDB supports the PostgreSQL wire protocol dialect of SQL".

Those are the 4 databases supporting the Arrow Flight SQL JDBC driver that I am aware of. Cube.js seems to support the Postgres dialect too, doesn't it? In that case we might not need to implement any specific dialect.

vincentditlevinz avatar Jul 29 '25 12:07 vincentditlevinz

Hello,

There has been no update on this PR for a long time. I need to know whether or not you are interested in this addition, so I can decide whether it is worth spending more time exploring cube.js technology (for us, compatibility with Apache Arrow based technology is mandatory). Thanks.

Sincerely, Vincent Mathon

vincentditlevinz avatar Sep 03 '25 19:09 vincentditlevinz

@igorlukanin, @ovr, @KSDaemon ?

vincentditlevinz avatar Sep 08 '25 07:09 vincentditlevinz

Hey @vincentditlevinz 👋 Sorry for the radio silence. I would suggest publishing this as an npm package (see 1 here: https://github.com/cube-js/cube/blob/master/CONTRIBUTING.md#contributing-database-drivers) and linking it from docs. We'll be able to see if it picks up and then decide whether we need to include it in the main distribution. Also, thanks for all your hard work on this one!

igorlukanin avatar Sep 17 '25 20:09 igorlukanin