Providing access to `colNames` from `sqlite.Stmt`
Would it be feasible to allow or provide access to Stmt.colNames? I'm returning slices of values within a ResultFunc (to avoid the overhead of repeated maps) so having the column name to index lookup would be useful.
I'm currently iterating through the ColumnCount() and pulling out stmt.ColumnName(i) one-by-one - easy enough to do, but duplicates the logic and work already done within Conn.prepare. I recognise colNames is used internally by the various getters so modification were a concern, perhaps a function like ColumnNames() might return a clone of the map instead.
Thanks for the consideration.
(I had to remind myself whether this had come up before, and I was mistakenly remembering #95, which is about binding names instead of column names.)
Providing direct access would be problematic because the data structure could change internally, as you say. I had a cursory glance over the code, and I'm dismayed to discover that prepare does the work and then ignores it for *Stmt.ColumnName, allocating every time: https://github.com/zombiezen/go-sqlite/blob/ac68309514835610d4ec5c620278cbc5ff09003d/sqlite.go#L871-L873
For both binding names and column names, I think the accessor methods are already the right API. However, in the case of column names, there is an (unintentional) performance penalty for using them. IMO fixing that would be better.
I know that wasn't exactly the issue you raised, so I want to double-check: if the library were to change such that calling *Stmt.ColumnName(i) and *Stmt.ColumnCount() in the way you are already using it is as efficient as accessing *Stmt.colNames directly, would there still be a concern? (Or asked in a different way: is there anything beyond what those two methods provide you're hoping to achieve by having direct access to colNames?)
From ResultFunc() I currently hand back a map of column names to indexes (e.g. the same data that's in colNames - a map[string]int) alongside slices of result rows (e.g. [][]any). This provides a column name reference for when stmt is no longer available later - in different applications. In my case, colNames and result rows are handed off as JSON to other processes, but because I'm using sqlitex.Execute, stmt doesn't appear available outside of ResultFunc(). So, this wouldn't appear to provide a solution; the performance improvements to Stmt would still be beneficial of course.
On my data structures, I was previously returning a map[string]any for each row, keyed on column name string; this allowed a simple column name-to-data lookup for rows, but the processing and memory overhead of repeated maps is significant. Hence, creating a single colNames map whilst minimising the other data structures.
Here's roughly what my code looks like. As I was saying, getting colNames out requires a bit of finessing access to stmt, so API improvements that tie into execution via sqlitex would be greatly appreciated.
type Row = []any
type Rows = []Row
type ColNames = map[string]int
func execute(sql string) (Rows, ColNames, error) {
var rows Rows
colNames := ColNames{}
firstRow := true
execOptions := &sqlitex.ExecOptions{
ResultFunc: func(stmt *sqlite.Stmt) error {
if dataCount := stmt.DataCount(); dataCount != 0 {
row := make([]any, dataCount)
for i := range dataCount {
row[i] = ColumnAny(stmt, i) // Custom function to retrieve values as `any` type
if firstRow {
colNames[stmt.ColumnName(i)] = i
}
}
rows = append(rows, row)
firstRow = false
}
return nil
},
}
conn, err := sqlite.OpenConn("foobar.db", sqlite.OpenReadOnly)
if err != nil {
return nil, nil, err
}
sqlitex.Execute(conn, sql, execOptions)
return rows, colNames, nil
}
Thanks for the code snippet; this helped me understand much better. You're hitting friction with sqlitex.Execute because it is intended for simpler queries. Once you get to the point where you're doing sophisticated introspection like you are now, it's usually clearer to use the *sqlite.Stmt API directly. I'd rewrite that code like this (untested):
package main
import "zombiezen.com/go/sqlite"
type Row = []any
type Rows = []Row
type ColNames = map[string]int
func execute(sql string) (Rows, ColNames, error) {
conn, err := sqlite.OpenConn("foobar.db", sqlite.OpenReadOnly)
if err != nil {
return nil, nil, err
}
stmt, _, err := conn.PrepareTransient(sql)
if err != nil {
return nil, nil, err
}
defer stmt.Finalize()
colNames := make(ColNames)
for i, n := 0, stmt.ColumnCount(); i < n; i++ {
colNames[stmt.ColumnName(i)] = i
}
var rows Rows
for {
hasRow, err := stmt.Step()
if err != nil {
return rows, colNames, err
}
if !hasRow {
break
}
dataCount := stmt.DataCount()
if dataCount == 0 {
continue
}
row := make([]any, dataCount)
for i := range dataCount {
row[i] = ColumnAny(stmt, i) // Custom function to retrieve values as `any` type
}
rows = append(rows, row)
}
return rows, colNames, nil
}
And thus, the improvement I suggested in my previous message would be applicable here. I've opened #118 for that.
Thanks for the suggestion. I didn't capture it in my example but I'm also using sqlitex's arg and named arg handling, so switching to using *sqlite.Stmt directly would mean needing to copy sqlitex's various private functions.
Not impossible of course, but if there was to be a callback in sqlitex.exec that was passed the stmt, this would provide access to the column names. Could having something like this be useful for other purposes ahead of query execution (e.g. customised bindings, other introspection etc)?