Make sqlite3.Open thread safe
- sqlite3_open_v2 implicitly calls sqlite3_initialize
- sqlite3_initialize is not thread safe in the prologue
- concurrent calls of sqlite3.Open even on different files may trigger mutex initialization race and crash
- Fix: use go channel with a single message in it as a barrier letting only single-threaded sqlite3_open_v2 for the first time
- Reproducibility: very low, couple hours of concurrent sqlite3.Open calls trigger a single crash
Sample crash dump:
Thread 1 (Thread 0x7fffee7fc700 (LWP 12250)):
#0 0x0000000000000000 in ?? ()
#1 0x0000000000ba9a55 in sqlite3_initialize () at sqlite3-binding.c:151936
#2 0x0000000000bebd49 in openDatabase (zFilename=0x7fffd8000b20 "file:gen/rootkey?_busy_timeout=1000&_synchronous=full&_journal_mode=wal",
ppDb=0xc0000b0488, flags=<optimized out>, zVfs=0x0) at sqlite3-binding.c:154720
#3 0x0000000000bec6f5 in sqlite3_open_v2 (filename=<optimized out>, ppDb=<optimized out>, flags=<optimized out>, zVfs=<optimized out>)
at sqlite3-binding.c:155075
#4 0x0000000000b8cb35 in _sqlite3_open_v2 (zVfs=<optimized out>, flags=<optimized out>, ppDb=<optimized out>, filename=<optimized out>)
at /home/ubuntu/go/pkg/mod/github.com/mattn/[email protected]/sqlite3.go:53
#5 _cgo_210ed2050b13_Cfunc__sqlite3_open_v2 (v=0xc0000d2bc0) at cgo-gcc-prolog:151
#6 0x0000000000460298 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:635
Relevant code for sqlite3-bindings.c
24769 #ifdef SQLITE_ENABLE_MULTITHREADED_CHECKS
24770 pFrom = multiThreadedCheckMutex();
24771 #else
24772 pFrom = sqlite3DefaultMutex();
24773 #endif
24774 }else{
24775 pFrom = sqlite3NoopMutex();
24776 }
24777 pTo->xMutexInit = pFrom->xMutexInit;
0x0000000000b8d61e <+222>: mov %rax,0xfc40cb(%rip) # 0x1b516f0 <sqlite3Config+112>
24787 }
24788 assert( sqlite3GlobalConfig.mutex.xMutexInit );
24789 rc = sqlite3GlobalConfig.mutex.xMutexInit();
0x0000000000b8d625 <+229>: jmpq *%rax
This repo is a Go wrapper around the SQLite C library. If you believe you have found a bug in the C library, you should report it on their mailing list and let one of the SQLite maintainers make the proper fix. https://www.sqlite.org/support.html
I understand. Comments above ‘sqlite3_initialize’ clearly state that it is a caller’s responsibility to init SQLite in a thread safe manner.
On Feb 27, 2020, at 19:35, rittneje [email protected] wrote:
This repo is a Go wrapper around the SQLite C library. If you believe you have found a bug in the C library, you should report it on their mailing list and let one of the SQLite maintainers make the proper fix. https://www.sqlite.org/support.html
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.
Coverage increased (+0.2%) to 51.678% when pulling 28b23918db6ac5d91233bf723a72eed19af3b5d9 on pzbitskiy:initialize-fix into 67986a7832dcbda29de5c1341b3dec6c6d97511a on mattn:master.
Comments above ‘sqlite3_initialize’ clearly state that it is a caller’s responsibility to init SQLite in a thread safe manner.
The docs say otherwise. https://www.sqlite.org/c3ref/initialize.html
The sqlite3_initialize() interface is threadsafe
It might be thread safe in master - some memory barriers are added to sync up caches and memory.
In the same time the comment above the function:
** The first thread to call this routine runs the initialization to
** completion. If subsequent threads call this routine before the first
** thread has finished the initialization process, then the subsequent
** threads must block until the first thread finishes with the initialization.
Feel free to close, your call. The only thing go-sqlite3 1.10 (with sqlite 3.25.2) crashes on concurrent Open calls.
I believe that comment is describing what the implementation of sqlite3_initialize does. If they were intended to be preconditions on the caller, I imagine they would have been in the public documentation. Like I said earlier, if you believe you have found a bug in SQLite, you should post your issue on their mailing list. I don't believe that this repo is the right place for a fix to the kind of problem you are experiencing. But I'm also not the owner of this repo, just giving my two cents.
@tsachiherman Care to elaborate on the downvote? I still contend this change should not be merged, given that the claim is that there is a bug in the SQLite library itself.
@pzbitskiy Why do you believe those lines you posted from sqlite3-bindings.c to be relevant? They aren't the line numbers from your crash dump. Did you come to those lines from some deeper investigating you did? Also, can you post how you are calling sql.Open?
Unfortunately I can't remember all the details after 5 months. But here is what I recalled.
They aren't the line numbers from your crash dump.
They are:
sqllite3_initialize:
151936 rc = sqlite3MutexInit();
151937 if( rc ) return rc;
sqlite3MutexInit:
24757 SQLITE_PRIVATE int sqlite3MutexInit(void){
24758 int rc = SQLITE_OK;
...
24768 if( sqlite3GlobalConfig.bCoreMutex ){
- Zero address on top of the stack tells us about a jump to NULL address.
- Previous item in the call stack points to
sqlite3MutexInit. - There is only one call by a pointer there:
24789 rc = sqlite3GlobalConfig.mutex.xMutexInit();
0x0000000000b8d625 <+229>: jmpq *%rax
- And yes, RAX is zero and assert is stripped out in release build.
Now follow the path:
- Two threads enter
sqllite3_initializeand hit sqlite3MutexInit - Next, inside
sqlite3MutexInitthey execute these assignments concurrently. - I understand how one thread could see NULL in
xMutexInitin case of no memory barriers but there is one:
24784 pTo->xMutexNotheld = pFrom->xMutexNotheld;
24785 sqlite3MemoryBarrier();
24786 pTo->xMutexAlloc = pFrom->xMutexAlloc;
I think I had explanation about it but can't figure it out.
@tsachiherman Care to elaborate on the downvote? I still contend this change should not be merged, given that the claim is that there is a bug in the SQLite library itself.
@rittneje, try to look on this issue from an application perspective - the applicative stack would look like this: [Application] -> [Go SQL library] -> [mattn sqlite3 driver] -> [sqlite3 library]
So, from workaround perspective:
- At this point in time, if an application developer want to release a product that doesn't have this issue, he/she would need to patch it in the application itself.
- Google aren't going to do anything about this issue in their SQL library.
- The mattn sqlite3 driver isn't the source of the issue, according to your claim.
- sqlite3 library (according to you) need to be fixed.
I'm not claiming that it shouldn't be fixed in the sqlite3 library itself. I think it should. However, from practical perspective, even if the sqlite3 library would get updated, it will take a long time before the fix reaches everyone.
That's why I think that in the case of such a trivial fix, it should be patched in the driver right away. With that, the bug would no longer be applicable, and the driver would be safe for usage. This workaround can be removed from the driver once the sqlite3 library would confirm to its own documentation.
Another note - even if you were to report that to the sqlite3 library team, they might fix it by updating their documentation.. which would leave you at the same place.