Next and Scan appear to be relatively slow?
Howdy. I'm not sure this really merits an issue or whether it would be better as a conversation someplace. I'm using go-sqlite3 for a project and I noticed that it's quite a bit slower to select a large number of rows compared to doing the same via the sqlite3 CLI client (roughly 4-5x slower). A little profiling shows that the time is mostly spent in Next and Scan. For small row sets it's not a big deal, but I'm testing with up to ~500K rows.
I suppose my starting question is how much overhead is expected going back and forth between Go and the C API? If it's expected to be fairly fast/low-overhead, any suspicions on where I might start looking?
Here's a sketch of approximately what the code is doing:
https://gist.github.com/mhat/ded97fe58922b4af9d03f375b465c339
There are two variations: one uses the new ScanType support, while the other cheats with a list of string pointers. Neither appreciably moved the needle for Scan.
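For context, the string-pointer variation boils down to roughly the following (a condensed sketch with a made-up big_table schema and test.db file; the gist above has the actual code):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "test.db") // hypothetical database file
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hypothetical schema: three TEXT columns in a large table.
	rows, err := db.Query("SELECT a, b, c FROM big_table")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	// Scan into pre-declared string pointers to sidestep per-row
	// destination allocation; each iteration still crosses cgo.
	var a, b, c string
	dest := []interface{}{&a, &b, &c}
	for rows.Next() {
		if err := rows.Scan(dest...); err != nil {
			log.Fatal(err)
		}
		// ... use a, b, c ...
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```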
Go's reflect is not fast, and go-sqlite3 is a wrapper around the sqlite3 C API, so the code works through many layers:
CLI:
database <=> sqlite3 API
go-sqlite3:
database <=> sqlite3 API <=> cgo binding <=> reflect
Yup. I suppose I was wondering if there's anything that can be done about it, e.g. avoiding some of the reflection, or a batch-y interface to Next/Scan.
I'm also wondering what's the fastest way to read 1 million rows. A loop of
for rows.Next() { ... }
is taking 8 seconds, which is much too slow (in Python a fetchall() takes only 4 seconds).
Are there any faster golang sqlite3 packages?
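For anyone trying to reproduce this, a CPU profile makes it easy to confirm where the time goes. A minimal sketch using the standard runtime/pprof package (the file name and loop contents are placeholders):

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// ... run the rows.Next()/rows.Scan loop here ...
}

// Inspect afterwards with: go tool pprof cpu.prof
```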
Since you're forced to use cgo either way, what about writing a shim layer in C which would implement your minimal batch functionality by calling the SQLite API directly and then returning the results to the Go side? IMO the code would be fairly cross-platform, if that matters.
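Here's a rough sketch of what such a shim could look like, assuming a system libsqlite3 and a single-TEXT-column query; fetch_batch, big_table, and test.db are all made up for illustration. The idea is simply that one cgo call pulls up to 4096 rows, amortizing the per-call overhead; a real shim would need multi-column and type support plus proper error reporting.

```go
package main

/*
#cgo LDFLAGS: -lsqlite3
#include <sqlite3.h>
#include <stdlib.h>
#include <string.h>

// fetch_batch steps up to `max` rows of a single TEXT column and packs the
// values, NUL-terminated, into buf, so one cgo call fetches many rows.
// Returns the number of rows copied, or -1 if a value doesn't fit (a real
// shim would resume instead of erroring); *done is set once the statement
// is exhausted (or on a step error, which a real shim would report).
static int fetch_batch(sqlite3_stmt *stmt, char *buf, int buflen,
                       int max, int *done) {
	int n = 0, off = 0;
	*done = 0;
	while (n < max) {
		int rc = sqlite3_step(stmt);
		if (rc != SQLITE_ROW) {
			*done = 1;
			break;
		}
		const unsigned char *txt = sqlite3_column_text(stmt, 0);
		int len = sqlite3_column_bytes(stmt, 0);
		if (off + len + 1 > buflen) {
			*done = 1;
			return -1;
		}
		memcpy(buf + off, txt ? (const char *)txt : "", len);
		buf[off + len] = 0;
		off += len + 1;
		n++;
	}
	return n;
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

func main() {
	var db *C.sqlite3
	name := C.CString("test.db") // hypothetical database file
	defer C.free(unsafe.Pointer(name))
	if C.sqlite3_open(name, &db) != C.SQLITE_OK {
		panic("open failed")
	}
	defer C.sqlite3_close(db)

	var stmt *C.sqlite3_stmt
	query := C.CString("SELECT name FROM big_table") // hypothetical query
	defer C.free(unsafe.Pointer(query))
	if C.sqlite3_prepare_v2(db, query, -1, &stmt, nil) != C.SQLITE_OK {
		panic("prepare failed")
	}
	defer C.sqlite3_finalize(stmt)

	buf := make([]byte, 1<<20) // 1 MiB scratch buffer, reused every batch
	var done C.int
	total := 0
	for done == 0 {
		n := int(C.fetch_batch(stmt, (*C.char)(unsafe.Pointer(&buf[0])),
			C.int(len(buf)), 4096, &done))
		if n < 0 {
			panic("row value too large for batch buffer")
		}
		total += n
		// buf now holds n NUL-terminated strings; split them off here.
	}
	fmt.Println("rows fetched:", total)
}
```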
(Actually, it's the Scan inside the rows.Next() loop that's slow.)
It would be nice if database/sql could add the equivalent of Python's fetchall, e.g. rows.ScanAll() or similar.
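In the meantime, something in that spirit can be written in user code. Here's a minimal sketch of a hypothetical fetchAll helper (not part of database/sql); note that it only trims per-row allocation and conversion overhead, since each Next() still pays its cgo round trip:

```go
package fetch

import "database/sql"

// fetchAll drains rows into string slices, roughly mimicking Python's
// cursor.fetchall(). Hypothetical helper, not part of database/sql.
func fetchAll(rows *sql.Rows) ([][]string, error) {
	cols, err := rows.Columns()
	if err != nil {
		return nil, err
	}
	// Scan into sql.RawBytes to avoid an intermediate string allocation
	// per column per row inside database/sql.
	vals := make([]sql.RawBytes, len(cols))
	args := make([]interface{}, len(cols))
	for i := range vals {
		args[i] = &vals[i]
	}
	var out [][]string
	for rows.Next() {
		if err := rows.Scan(args...); err != nil {
			return nil, err
		}
		row := make([]string, len(cols))
		for i, v := range vals {
			row[i] = string(v) // copy: RawBytes is only valid until the next Scan
		}
		out = append(out, row)
	}
	return out, rows.Err()
}
```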
This issue tracks a design proposal for a Go 2 version of database/sql; it'd be nice if SQLite-specific quirks were taken into account there, especially the need for a Driver interface that can reduce CGO round trips to a minimum.
This other, somewhat related issue has an interesting analysis of the CGO-related performance penalties that go-sqlite3 is currently bound to incur.
@kostix @robert-king @freeekanayaka @mhat Help wanted: could someone write a PR proposal that implements such a batch feature?
I'm interested in a solution. Has anyone devised or used a faster way to rapidly read a bunch of rows without the overhead?
I believe that if you replace SQLite with a single-node dqlite instance embedded in your application, you'll get much faster bulk reads. I haven't tested that, though.
Basically you'd get rid of CGO: instead, your Go application would run a dqlite thread that opens a private Unix socket. Instead of go-sqlite3 you'd use the go-dqlite driver, which would connect to that Unix socket and fetch rows much faster, because the work would mainly be memory copies between threads, with no CGO and its associated context switches.
Short of that, I think @kostix's idea would be reasonably fast, although arguably harder to implement and less flexible/maintainable/extendable.
If one is to explore possibilities other than github.com/mattn/go-sqlite3, I'd certainly recommend looking at https://gitlab.com/cznic/sqlite, which is a full source-code-level translation of stock SQLite to Go and passes the original test suite (certainly a feat, even for the original code ;-)).
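Since it's exposed as a regular database/sql driver, trying it should mostly be a matter of swapping the import. A sketch, with the caveat that the driver name "sqlite" is an assumption to verify against the package docs:

```go
package main

import (
	"database/sql"
	"log"

	// Assumption: the package registers a database/sql driver named
	// "sqlite"; check its docs for the exact name.
	_ "gitlab.com/cznic/sqlite"
)

func main() {
	// Same database/sql code as with go-sqlite3; only the driver name
	// changes, and there's no cgo anywhere in the stack.
	db, err := sql.Open("sqlite", "test.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```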