[poc] Async support
PoC on one approach of supporting true async apis.
Goal
The goal is to support server restart while a query is running.
It is not the goal to support parallelism and non-blocking calls. These can easily be supported with go routines with the standard sql API.
API
Function and type names are not final.
sql.OpenDB(connector) works the same. No change. This is the synchronous API.
dbsql.OpenDB(connector) is new. This is where the asynchronous APis live.
So the user has to explicitly want to interact with asynchronous APIs.
dbsql.OpenDB() returns `DatabricksDB interface.
type DatabricksDB interface {
QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, Execution, error)
CancelExecution(ctx context.Context, exc Execution) error
GetExecutionRows(ctx context.Context, exc Execution) (*sql.Rows, error)
CheckExecution(ctx context.Context, exc Execution) (Execution, error)
// .... other relevant sql.DB APIs
Conn(ctx context.Context) (*sql.Conn, error)
Close() error
Stats() sql.DBStats
SetMaxOpenConns(n int)
Driver() driver.Driver
// ...
}
The APIs support DirectResults, which means that calls are synchronous up to 5s, then become asynchronous. The sql.Rows and sql.Result APIs remain exactly the same.
Usage
A typical workflow would be something like:
rs, exc, err := db.QueryContext(ogCtx, `select * from dummy`)
if err != nil {
log.Fatal(err)
}
defer rs.Close()
// ... save exc in cache/db for continuing process in case server restarts
if exc.Status == dbsql.ExecutionFinished {
var res string
for rs.Next() {
err := rs.Scan(&res)
// ... so something with the result
}
}
// ... poll until status is terminal
for {
if exc.Status.Terminal() {
break
} else {
exc, err = db.CheckExecution(ogCtx, exc)
if err != nil {
log.Fatal(err)
}
}
// must poll at least every 10 minutes
time.Sleep(time.Second)
}
// ... get results
if exc.Status == dbsql.ExecutionFinished {
rs, err = db.GetExecutionRows(ogCtx, exc)
if err != nil {
log.Fatal(err)
}
var res string
for rs.Next() {
err := rs.Scan(&res)
// ... so something with the result
}
} else {
... handle error cases
}
one idea to improve would be rs, exc, err = db.CheckExecution(ogCtx, exc). This would make a single call and make the loop simpler