Destination databricks: switch to oss jdbc driver

Open edgao opened this issue 1 year ago • 2 comments

closes https://github.com/airbytehq/airbyte-internal-issues/issues/9120

also it seems like we don't need the databricks sdk at all?

The new driver has a slightly different interface: you can't directly supply a URL, and instead have to supply individual fields/properties. I tried to port over our existing config, but removed the transportMode=http and EnableArrow=0 settings to see if they're still needed.

The Databricks documentation doesn't describe how to do OAuth at all; it only covers PAT (https://docs.gcp.databricks.com/en/integrations/jdbc/oss.html#authenticate-the-driver). I naively copied our old OAuth setup to the new interfaces, but it doesn't work. Errors seen so far:

  • DatabricksSQLException: Communication link failure. Failed to connect to server. :https://dbc-6aebf761-f8d6.cloud.databricks.com:443accessToken must be defined
  • DatabricksSQLException: Communication link failure. Failed to connect to server. :https://dbc-6aebf761-f8d6.cloud.databricks.com:443Cannot invoke "com.databricks.sdk.core.oauth.OpenIDConnectEndpoints.getTokenEndpoint()" because "jsonResponse" is null).

notable changes in the oss driver:

  • timestamps with timezone now have a timezone directly from the driver
  • timestamps without timezone have .000 precision
  • large inline results now fail with: Inline byte limit exceeded. Statements executed with disposition=INLINE can have a result size of at most 26214400 bytes. Please execute the statement with disposition=EXTERNAL_LINKS if you want to download the full result

which means:

  • the destinationhandler can parse directly to an Instant, instead of needing to go through LocalDateTime
  • tons of changes in the expected records
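
For reviewers, a minimal sketch of the difference between the two parsing paths. This is not the actual destinationhandler code: the class and method names are hypothetical, and it assumes the driver hands back ISO-8601 strings, which may not match the real wire format. Treating the zoneless value as UTC in the old path is also an assumption.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, for illustration only.
final class TimestampParsing {

    // Old path: the value has no zone, so parse it as a LocalDateTime
    // and attach a zone afterwards (UTC is assumed here).
    static Instant viaLocalDateTime(String raw) {
        return LocalDateTime.parse(raw).toInstant(ZoneOffset.UTC);
    }

    // New path: the value already carries an offset, so it maps
    // straight to an Instant with no intermediate LocalDateTime.
    static Instant direct(String raw) {
        return OffsetDateTime.parse(raw).toInstant();
    }

    public static void main(String[] args) {
        System.out.println(viaLocalDateTime("2024-08-14T15:08:00.000"));
        System.out.println(direct("2024-08-14T15:08:00.000Z"));
    }
}
```

The direct path also avoids having to guess a zone, since the offset comes from the driver rather than from our code.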

edgao avatar Aug 14 '24 15:08 edgao

  • #44033 Graphite 👈
  • #44505 Graphite: 1 other dependent PR (#44506 Graphite)
  • master

This stack of pull requests is managed by Graphite.

edgao avatar Aug 14 '24 16:08 edgao

We are closing this PR due to inactivity. We regularly close PRs that were created more than 6 months ago. If you'd like to keep working on it, please feel free to reopen it. We'd be happy to keep working with you.

cgardens avatar Aug 14 '25 17:08 cgardens