databricks-sql-go

Critical: "context deadline exceeded" after upgrade from 1.4.0 to 1.5.3

Open gilsegment opened this issue 1 year ago • 2 comments

We started getting "context deadline exceeded" after upgrading from 1.4.0 to 1.5.3. This happens during the initial connection to the warehouse. We are getting that error on multiple Databricks warehouses from different accounts.

I suspect we are getting timeout here: https://github.com/databricks/databricks-sql-go/blob/164893503c207fa6fc26e99666d54a6ebcb67d29/connection.go#L82

which uses a hard-coded timeout of 60s with no option to modify it: https://github.com/databricks/databricks-sql-go/blob/beea4c4d35ce778a9e916ac03d463c59d422a5fb/internal/config/config.go#L191

Maybe the "timeout" provided in the DSN should also be used for the ping: https://github.com/databricks/databricks-sql-go/blob/beea4c4d35ce778a9e916ac03d463c59d422a5fb/internal/config/config.go#L242

gilsegment, Apr 03 '24 21:04

@gilsegment Can you please help us narrow down the scope of the issue? 1.5.3 doesn't introduce many changes, so can you please try upgrading gradually from 1.4.0 and check which version introduces the issue? That would help us a lot. Thank you!

kravets-levko, Apr 17 '24 12:04

Unfortunately I cannot do that. A few things I can suggest:

  1. See which changes are relevant in the release notes from 1.4.0 to 1.5.3
  2. Respect the timeout input parameter like I recommended in my initial comment. Or introduce a new parameter just for the ping.
  3. Check whether there was a backend issue with Databricks at the time I reported this that might have caused it. I think this is less likely because we saw this happen right after the client library was upgraded, but it is still possible. (The first query after connecting takes a long time: "select 1" in our case.)

gilsegment, Apr 17 '24 13:04