databricks-sql-go icon indicating copy to clipboard operation
databricks-sql-go copied to clipboard

[poc] Async support

Open andrefurlan-db opened this issue 3 years ago • 1 comments

PoC on one approach of supporting true async apis.

Goal

The goal is to support server restart while a query is running.

It is not the goal to support parallelism and non-blocking calls. These can easily be supported with go routines with the standard sql API.

API

Function and type names are not final. sql.OpenDB(connector) works the same. No change. This is the synchronous API. dbsql.OpenDB(connector) is new. This is where the asynchronous APis live.

So the user has to explicitly want to interact with asynchronous APIs.

dbsql.OpenDB() returns `DatabricksDB interface.

type DatabricksDB interface {
	QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, Execution, error)
	CancelExecution(ctx context.Context, exc Execution) error
	GetExecutionRows(ctx context.Context, exc Execution) (*sql.Rows, error)
	CheckExecution(ctx context.Context, exc Execution) (Execution, error)
	// .... other relevant sql.DB APIs
	Conn(ctx context.Context) (*sql.Conn, error)
	Close() error
	Stats() sql.DBStats
	SetMaxOpenConns(n int)
	Driver() driver.Driver
	// ...
}

The APIs support DirectResults, which means that calls are synchronous up to 5s, then become asynchronous. The sql.Rows and sql.Result APIs remain exactly the same.

Usage

A typical workflow would be something like:

	rs, exc, err := db.QueryContext(ogCtx, `select * from dummy`)
	if err != nil {
		log.Fatal(err)
	}
	defer rs.Close()
	// ... save exc in cache/db for continuing process in case server restarts

	if exc.Status == dbsql.ExecutionFinished {
		var res string
		for rs.Next() {
			err := rs.Scan(&res)
			// ... so something with the result 
		}
	}
	// ... poll until status is terminal
	for {
		if exc.Status.Terminal() {
			break
		} else {
			exc, err = db.CheckExecution(ogCtx, exc)
			if err != nil {
				log.Fatal(err)
			}
		}
		// must poll at least every 10 minutes
		time.Sleep(time.Second)
	}
	// ... get results
	if exc.Status == dbsql.ExecutionFinished {
		rs, err = db.GetExecutionRows(ogCtx, exc)
		if err != nil {
			log.Fatal(err)
		}
		var res string
		for rs.Next() {
			err := rs.Scan(&res)
			// ... so something with the result 
		}
	} else {
		... handle error cases
	}

andrefurlan-db avatar Nov 22 '22 06:11 andrefurlan-db

one idea to improve would be rs, exc, err = db.CheckExecution(ogCtx, exc). This would make a single call and make the loop simpler

andrefurlan-db avatar Nov 23 '22 20:11 andrefurlan-db