databricks-sql-python

[Feature Request] Support async execution

Open zhaorui2022 opened this issue 1 year ago • 3 comments

We are currently using the approach in https://github.com/databricks/databricks-sql-python/blob/a5b1ab0745bb0a4e917ea800f36ae4b74d079a75/examples/README.md?plain=1#L34 to fetch the statement ID when executing queries through SQLAlchemy. However, the UUID is not returned until the very end of query execution, and if the query fails, it is never returned at all. Is it possible to support async execution so that we can get the statement ID as soon as possible?

zhaorui2022 avatar Jun 26 '24 21:06 zhaorui2022

Is there a timeline for getting proper async support? We would love that to speed up our Databricks workflows! There was clearly some work going on by susodapop (who seems to have left Databricks) in #322, #325 and the branch peco-1263-staging, all of which stalled 7 months ago.

MeinAccount avatar Aug 09 '24 15:08 MeinAccount

+1. This affects Databricks usability in external systems: it is not possible to navigate to the query while it is running, or after it has failed or timed out, which makes debugging much more complex than it should be.

bkyryliuk avatar Aug 14 '24 17:08 bkyryliuk

@rcypher-databricks @yunbodeng-db @andrefurlan-db @jackyhu-db @benc-db @kravets-levko

There is community demand for async execution for various reasons; see #82 and #176.

My use case is OpenAI parallel tool calling, which can generate multiple queries, but they are executed sequentially due to the lack of async support.

hayescode avatar Aug 27 '24 22:08 hayescode
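For readers with the same parallel-query use case, one common workaround for a blocking client is a thread pool. The sketch below is only an illustration under stated assumptions: `run_query` is a hypothetical stand-in that sleeps instead of contacting a warehouse, and in real use each worker would presumably open its own connection and cursor rather than share one.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql_text: str) -> str:
    """Hypothetical stand-in for a blocking query call.

    In real use this would open its own connection and cursor, execute
    sql_text, and return the fetched rows.
    """
    time.sleep(0.2)  # simulate time spent waiting on the warehouse
    return f"result of: {sql_text}"


def run_queries_concurrently(queries: list[str], max_workers: int = 4) -> list[str]:
    # pool.map preserves input order, so results line up with queries.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))


queries = ["SELECT 1", "SELECT 2", "SELECT 3"]
results = run_queries_concurrently(queries)
```

Because the workers spend their time waiting on I/O, the three simulated queries overlap instead of running back to back.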

We are actively working on this.

gopalldb avatar Nov 15 '24 04:11 gopalldb

@zhaorui2022 cc @deeksha-db We have added support for async execution as of v3.7.0.

jprakash-db avatar Feb 17 '25 07:02 jprakash-db
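A rough sketch of the submit-then-poll pattern this kind of async execution implies is shown below. It assumes the cursor exposes `execute_async`, `is_query_pending`, `get_async_execution_result` and `fetchall`; check the connector documentation for your installed version before relying on those names. `FakeCursor` and `wait_for_result` are hypothetical helpers, included only so the example runs without a warehouse.

```python
import time


def wait_for_result(cursor, poll_interval: float = 1.0, timeout: float = 60.0):
    """Poll a cursor until its asynchronously submitted query finishes.

    Assumes the cursor provides is_query_pending(),
    get_async_execution_result() and fetchall().
    """
    deadline = time.monotonic() + timeout
    while cursor.is_query_pending():
        if time.monotonic() > deadline:
            raise TimeoutError("query still pending after timeout")
        time.sleep(poll_interval)
    cursor.get_async_execution_result()  # load the finished result set
    return cursor.fetchall()


class FakeCursor:
    """Hypothetical stand-in that reports 'pending' a fixed number of times."""

    def __init__(self, polls_until_done: int, rows: list):
        self._remaining = polls_until_done
        self._rows = rows
        self._ready = False

    def execute_async(self, operation: str) -> None:
        self.operation = operation  # a real cursor would submit and return

    def is_query_pending(self) -> bool:
        if self._remaining > 0:
            self._remaining -= 1
            return True
        return False

    def get_async_execution_result(self) -> None:
        self._ready = True

    def fetchall(self) -> list:
        if not self._ready:
            raise RuntimeError("result not loaded yet")
        return self._rows


cursor = FakeCursor(polls_until_done=3, rows=[(1,)])
cursor.execute_async("SELECT 1")
# Other work could happen here while the query runs.
rows = wait_for_result(cursor, poll_interval=0.01)
```

The timeout is a design choice of this sketch, not something the thread prescribes; it simply keeps a stuck query from blocking the caller forever.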

I don't feel like the API is appropriately async. When running a MERGE statement, the call to Cursor.execute_async waits for the statement to complete and only then returns to my Python code. I have traced this back to the inclusion of getDirectResults=ttypes.TSparkGetDirectResults(...) in ThriftBackend.executeCommand. Removing this parameter makes the execution actually run asynchronously. I have had no luck with changing maxRows or maxBytes (to either 0 or 1).

MeinAccount avatar Feb 17 '25 07:02 MeinAccount

@MeinAccount We have fixed the issue, and the fix is available in the latest release of the PySQL Connector, v3.7.3.

jprakash-db avatar Mar 04 '25 09:03 jprakash-db

So is it possible to execute a query using await? Are there query functions that are defined with async def ...? Having native Python async methods would be really helpful for integrating properly with the rest of the Python ecosystem.

mnussbaum-eq avatar Jul 18 '25 05:07 mnussbaum-eq
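Whatever the connector itself provides, a blocking call can be made awaitable with the standard library by offloading it to a worker thread. The sketch below is a generic asyncio pattern, not the connector's API: `blocking_query` is a hypothetical stand-in for a blocking execute-and-fetch call, and `asyncio.to_thread` requires Python 3.9 or newer.

```python
import asyncio
import time


def blocking_query(sql_text: str) -> str:
    """Hypothetical stand-in for a blocking execute + fetch call."""
    time.sleep(0.1)  # simulate waiting on the warehouse
    return f"rows for: {sql_text}"


async def query(sql_text: str) -> str:
    # asyncio.to_thread runs the blocking call in a worker thread so the
    # event loop stays free to schedule other tasks.
    return await asyncio.to_thread(blocking_query, sql_text)


async def main() -> list[str]:
    # gather returns results in the order the awaitables were passed.
    return await asyncio.gather(query("SELECT 1"), query("SELECT 2"))


results = asyncio.run(main())
```

This keeps the event loop responsive but still occupies one thread per in-flight query, so it is a workaround rather than native async I/O.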