Slow performance on bulk insert
Bulk inserts appear to take quadratic time over the number of statements being processed. Using v1.14.16, I find that a bulk insert that finishes quickly for the sqlite3 shell does not finish at all when run in golang.
My supposition is that this is because the sqlite3_prepare_v2 function returns a pointer to the remaining unprocessed sql, which we then copy back into a Go string. On the next run of the loop in exec(), we do another roundtrip of copying strings Go->C->Go. By contrast, the sqlite3 shell performs the same task without copying the underlying buffer, and finishes significantly faster.
A representative stack trace shows indicates time spent in runtime.memmove():
runtime.memmove()
src/runtime/memmove_amd64.s:432 +0x513 fp=0xc0005db3d0 sp=0xc0005db3c8 pc=0x19fca13
github.com/mattn/go-sqlite3._Cfunc_CString({0xc00a3d6001, 0x9254ac8})
_cgo_gotypes.go:200 +0x7b fp=0xc0005db400 sp=0xc0005db3d0 pc=0x27ee11b
github.com/mattn/go-sqlite3.(*SQLiteConn).prepare(0xc0010ca000, {0xb2ff30?, 0xc00021a008?}, {0xc00a3d6001?, 0x0?})
rh/external/com_github_mattn_go_sqlite3/sqlite3.go:1805 +0x5e fp=0xc0005db490 sp=0xc0005db400 pc=0x27fff3e
github.com/mattn/go-sqlite3.(*SQLiteConn).exec(0x0?, {0xb2ff30, 0xc00021a008}, {0xc0177da000?, 0x0?}, {0x3433fd0, 0x0, 0x0})
rh/external/com_github_mattn_go_sqlite3/sqlite3.go:853 +0x99 fp=0xc0005db5d8 sp=0xc0005db490 pc=0x27f9939
github.com/mattn/go-sqlite3.(*SQLiteConn).ExecContext(0xc0005db6c8?, {0xb2ff30, 0xc00021a008}, {0xc0177da000, 0x9255f37}, {0x3433fd0, 0x0, 0xc0010ca000?})
sqlite3_go18.go:41 +0xbb fp=0xc0005db670 sp=0xc0005db5d8 pc=0x27edb5b
Is this covered by #1133?
Yes!
This isn't an Go-SQLite3 issue. I think it's the expected and normal behavior. The default SQLite3 settings are set to safe, and slow.
@josnyder-rh I had the exact same problem on my project (which sole SQLite3 usage is bulk inserts). Google lead me to this amazing C++ SQLite3 performance guide and fortunately, the same techniques worked for me.
Here's the 2 line change that fixed it for me.
I wrote a standalone sqlite3.go benchmark file to help me understand this situation. It's the simplest way to explain what's going on.
Basically the idea is to:
- Use transactions, to batch updates in a single database commit (I didn't need it).
- Disable SQLite3 synchronous mode, which block until the commit is fully writen to disk.
- Set journal_mode to memory.
On production what you want to do is to create a new DB connection just for the bulk inserts, and change the PRAGMA for just those operations.
For me the result was importing 350,000 lines at 10KB/s (13min) to 3.2MB/s (2sec).
I've fully detailed my whole resolution process on issue #1, should you want more information.