go-sqlite3 icon indicating copy to clipboard operation
go-sqlite3 copied to clipboard

Make sqlite3.Open thread safe

Open pzbitskiy opened this issue 5 years ago • 10 comments

  • sqlite3_open_v2 implicitly calls sqlite3_initialize
  • sqlite3_initialize is not thread safe in the prologue
  • concurrent calls of sqlite3.Open even on different files may trigger mutex initialization race and crash
  • Fix: use go channel with a single message in it as a barrier letting only single-threaded sqlite3_open_v2 for the first time
  • Reproducibility: very low, couple hours of concurrent sqlite3.Open calls trigger a single crash

Sample crash dump:

Thread 1 (Thread 0x7fffee7fc700 (LWP 12250)):
#0  0x0000000000000000 in ?? ()
#1  0x0000000000ba9a55 in sqlite3_initialize () at sqlite3-binding.c:151936
#2  0x0000000000bebd49 in openDatabase (zFilename=0x7fffd8000b20 "file:gen/rootkey?_busy_timeout=1000&_synchronous=full&_journal_mode=wal", 
    ppDb=0xc0000b0488, flags=<optimized out>, zVfs=0x0) at sqlite3-binding.c:154720
#3  0x0000000000bec6f5 in sqlite3_open_v2 (filename=<optimized out>, ppDb=<optimized out>, flags=<optimized out>, zVfs=<optimized out>)
    at sqlite3-binding.c:155075
#4  0x0000000000b8cb35 in _sqlite3_open_v2 (zVfs=<optimized out>, flags=<optimized out>, ppDb=<optimized out>, filename=<optimized out>)
    at /home/ubuntu/go/pkg/mod/github.com/mattn/[email protected]/sqlite3.go:53
#5  _cgo_210ed2050b13_Cfunc__sqlite3_open_v2 (v=0xc0000d2bc0) at cgo-gcc-prolog:151
#6  0x0000000000460298 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:635

Relevant code for sqlite3-bindings.c

24769	#ifdef SQLITE_ENABLE_MULTITHREADED_CHECKS
24770	      pFrom = multiThreadedCheckMutex();
24771	#else
24772	      pFrom = sqlite3DefaultMutex();
24773	#endif
24774	    }else{
24775	      pFrom = sqlite3NoopMutex();
24776	    }
24777	    pTo->xMutexInit = pFrom->xMutexInit;
   0x0000000000b8d61e <+222>:	mov    %rax,0xfc40cb(%rip)        # 0x1b516f0 <sqlite3Config+112>

24787	  }
24788	  assert( sqlite3GlobalConfig.mutex.xMutexInit );
24789	  rc = sqlite3GlobalConfig.mutex.xMutexInit();
   0x0000000000b8d625 <+229>:	jmpq   *%rax

pzbitskiy avatar Feb 28 '20 00:02 pzbitskiy

This repo is a Go wrapper around the SQLite C library. If you believe you have found a bug in the C library, you should report it on their mailing list and let one of the SQLite maintainers make the proper fix. https://www.sqlite.org/support.html

rittneje avatar Feb 28 '20 00:02 rittneje

I understand. Comments above ‘sqlite3_initialize’ clearly state that it is a caller’s responsibility to init SQLite in a thread safe manner.

On Feb 27, 2020, at 19:35, rittneje [email protected] wrote:

 This repo is a Go wrapper around the SQLite C library. If you believe you have found a bug in the C library, you should report it on their mailing list and let one of the SQLite maintainers make the proper fix. https://www.sqlite.org/support.html

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

pzbitskiy avatar Feb 28 '20 01:02 pzbitskiy

Coverage Status

Coverage increased (+0.2%) to 51.678% when pulling 28b23918db6ac5d91233bf723a72eed19af3b5d9 on pzbitskiy:initialize-fix into 67986a7832dcbda29de5c1341b3dec6c6d97511a on mattn:master.

coveralls avatar Feb 28 '20 01:02 coveralls

Comments above ‘sqlite3_initialize’ clearly state that it is a caller’s responsibility to init SQLite in a thread safe manner.

The docs say otherwise. https://www.sqlite.org/c3ref/initialize.html

The sqlite3_initialize() interface is threadsafe

rittneje avatar Feb 28 '20 02:02 rittneje

It might be thread safe in master - some memory barriers are added to sync up caches and memory.

In the same time the comment above the function:

** The first thread to call this routine runs the initialization to
** completion.  If subsequent threads call this routine before the first
** thread has finished the initialization process, then the subsequent
** threads must block until the first thread finishes with the initialization.

Feel free to close, your call. The only thing go-sqlite3 1.10 (with sqlite 3.25.2) crashes on concurrent Open calls.

pzbitskiy avatar Feb 28 '20 02:02 pzbitskiy

I believe that comment is describing what the implementation of sqlite3_initialize does. If they were intended to be preconditions on the caller, I imagine they would have been in the public documentation. Like I said earlier, if you believe you have found a bug in SQLite, you should post your issue on their mailing list. I don't believe that this repo is the right place for a fix to the kind of problem you are experiencing. But I'm also not the owner of this repo, just giving my two cents.

rittneje avatar Feb 28 '20 03:02 rittneje

@tsachiherman Care to elaborate on the downvote? I still contend this change should not be merged, given that the claim is that there is a bug in the SQLite library itself.

rittneje avatar Jul 11 '20 03:07 rittneje

@pzbitskiy Why do you believe those lines you posted from sqlite3-bindings.c to be relevant? They aren't the line numbers from your crash dump. Did you come to those lines from some deeper investigating you did? Also, can you post how you are calling sql.Open?

rittneje avatar Jul 11 '20 03:07 rittneje

Unfortunately I can't remember all the details after 5 months. But here is what I recalled.

They aren't the line numbers from your crash dump.

They are:

  1. sqllite3_initialize:
 151936   rc = sqlite3MutexInit();
 151937   if( rc ) return rc;
  1. sqlite3MutexInit:
  24757 SQLITE_PRIVATE int sqlite3MutexInit(void){ 
  24758   int rc = SQLITE_OK;
...
  24768     if( sqlite3GlobalConfig.bCoreMutex ){
  1. Zero address on top of the stack tells us about a jump to NULL address.
  2. Previous item in the call stack points to sqlite3MutexInit.
  3. There is only one call by a pointer there:
24789	  rc = sqlite3GlobalConfig.mutex.xMutexInit();
   0x0000000000b8d625 <+229>:	jmpq   *%rax
  1. And yes, RAX is zero and assert is stripped out in release build.

Now follow the path:

  1. Two threads enter sqllite3_initialize and hit sqlite3MutexInit
  2. Next, inside sqlite3MutexInit they execute these assignments concurrently.
  3. I understand how one thread could see NULL in xMutexInit in case of no memory barriers but there is one:
  24784     pTo->xMutexNotheld = pFrom->xMutexNotheld;
  24785     sqlite3MemoryBarrier();
  24786     pTo->xMutexAlloc = pFrom->xMutexAlloc;

I think I had explanation about it but can't figure it out.

pzbitskiy avatar Jul 11 '20 05:07 pzbitskiy

@tsachiherman Care to elaborate on the downvote? I still contend this change should not be merged, given that the claim is that there is a bug in the SQLite library itself.

@rittneje, try to look on this issue from an application perspective - the applicative stack would look like this: [Application] -> [Go SQL library] -> [mattn sqlite3 driver] -> [sqlite3 library]

So, from workaround perspective:

  1. At this point in time, if an application developer want to release a product that doesn't have this issue, he/she would need to patch it in the application itself.
  2. Google aren't going to do anything about this issue in their SQL library.
  3. The mattn sqlite3 driver isn't the source of the issue, according to your claim.
  4. sqlite3 library (according to you) need to be fixed.

I'm not claiming that it shouldn't be fixed in the sqlite3 library itself. I think it should. However, from practical perspective, even if the sqlite3 library would get updated, it will take a long time before the fix reaches everyone.

That's why I think that in the case of such a trivial fix, it should be patched in the driver right away. With that, the bug would no longer be applicable, and the driver would be safe for usage. This workaround can be removed from the driver once the sqlite3 library would confirm to its own documentation.

Another note - even if you were to report that to the sqlite3 library team, they might fix it by updating their documentation.. which would leave you at the same place.

tsachiherman avatar Jul 11 '20 15:07 tsachiherman