lmdb-go Deadlock under Linux

Deadlock under Linux

Open xlab opened this issue 7 years ago • 23 comments

Hello! I've been using my own bindings for a while, but recently switched to your package because of handy API and because Go doesn't allow to link the same C symbols under different package names.

The problem is, the code you're using in the package is somehow differs from their tip version on GitHub. I've been using the tip 0.9.70 from here: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h

It has no release date, at first I was confused (see the title change of this issue).

There isn't much difference in the API, except some documentation improve, but it has one big feature called MDB_USE_SYSV_SEM that allows to use different mutex provider under Linux environment.

Without this flag enabled, txn, err := l.env.BeginTxn(nil, 0) may deadlock, not Go kind of deadlock, but some deeply internal deadlock that can be only killed by kill -9. The holds true for Amazon AWS Linux or desktop Arch linux at least.

My advise is to switch to the tip LMDB version (no degradation AFAIK), and enable MDB_USE_SYSV_SEM by default for Linux systems. See the pull request related as an example of such transition - I just replaced C files and fixed types in two places in Go wrapping.

Jan 10 '17 10:01 xlab

Thanks for the input. Someone made #80 I while ago with the same request.

The C library is taken from the mdb.RE/0.9 branch of the openldap repository. AFAIK this is where stable releases are tagged. My understanding from the mailing list is that the master branch is not stable for projects other than openldap to use. If I have the wrong idea please let me know.

On 2016-12-28 LMDB 0.9.19 was tagged on that branch. And I recently updated lmdb-go to use that version.

I don't see MDB_USE_SYSV_SEM in that release though. If you could link me to a relevant mailing list discussion I would appreciate it.

Jan 10 '17 19:01 bmatsuo

@bmatsuo Thanks for review, it seems that MDB_USE_SYSV_SEM feature has not been merged in the stable branch yet, that's why I'm talking about their tip at

https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c

We can get semaphore backend options only by using their tip. That's the issue, I'm not a big fan of bleeding edges in production, but on the default semaphore backend it deadlocks on many linux systems from valid and safe single-goroutine Go code.

Jan 10 '17 21:01 xlab

I can't really find any useful info about MDB_USE_SYSV_SEM. I can't even find anyone on the mailing list saying that this is a problem. I will continue to look into the issue though.

I see that the python bindings provide an option to use either the stable branch or master. I wonder if something like that would be workable here using tags.

Jan 10 '17 21:01 bmatsuo

Not sure a branch that deadlocks on popular linux distros can be considered "stable", especially where there is no found traits from the "unstable" one.

When I first discovered that issue months ago, I spent 2 days debugging, just by a wild guess and by reading mdb.c code I found the root cause. I think you can search for MDB_USE_SYSV_SEM in mdb.c too, there are some comments about safety, robustness and availability of this backend in lmdb.h.

Jan 11 '17 00:01 xlab

Anyway, talk is cheap. There is a minimal reproducible example: https://gist.github.com/xlab/f7aee266ab741a0412b44ee145ccbc23

$ go get gist.github.com/xlab/f7aee266ab741a0412b44ee145ccbc23.git
$ f7aee266ab741a0412b44ee145ccbc23.git
sync batch 0
sync batch 64000
sync batch 128000
...
sync batch 960000
last put took: 46.385007ms
last key: 001000000
done

The deadlock happens there https://gist.github.com/xlab/f7aee266ab741a0412b44ee145ccbc23#file-lmdb_deadlock-go-L40 with random chance of occurrence, so life is shiny and fun.

The example runs successfully under OS X or on AWS Linux with -DMDB_USE_SYSV_SEM=1

Jan 11 '17 01:01 xlab

The reason why it happens and why the switch makes sense is because of the location where semaphores are kept. POSIX are kept in thread local storage, and SysV are system-wide. See http://stackoverflow.com/questions/368322/differences-between-system-v-and-posix-semaphores

That's why runtime.LockOSThread() also helps to avoid thread switches by goroutines, but you can't use it everywhere in the user code, also it's very error-prone.

Jan 11 '17 01:01 xlab

Reading through the git blame on the source at

https://github.com/LMDB/lmdb/blame/mdb.master/libraries/liblmdb/mdb.c

suggests that the -DMDB_USE_SYSV_SEM was added between 2-3 years ago, so I don't fathom why it doesn't show up in the version being used here.

Jan 22 '17 09:01 glycerine

Thanks for digging @glycerine

suggests that the -DMDB_USE_SYSV_SEM was added between 2-3 years ago

I really need to bring this up on the mailing list... I really thought someone in the world would have done it by now but the only thread about the flag that I found when I looked was one discussing a release of OpenLDAP where someone wanted to use it.

Edit: At this point I'm not sure if this is a bug or an enhancement for this project (feels more like the latter), so I just put both tags on for now.

Jan 28 '17 07:01 bmatsuo

That's why runtime.LockOSThread() also helps to avoid thread switches by goroutines, but you can't use it everywhere in the user code, also it's very error-prone.

@xlab let me get this straight. If you call runtime.LockOSThread() does your example stop deadlocking?

Jan 29 '17 12:01 bmatsuo

Sorry, I fat-fingered by response submission before I was done, @xlab. My point is that, if this is actually deadlocking because your main goroutine is switching threads then that is a documented application design constraint, and the library makes no guarantees about the safety of your data or application. Even in LMDB tip use of the library in such a way is not permitted.

Jan 29 '17 12:01 bmatsuo

your main goroutine is switching threads then that is a documented application design constraint, and the library makes no guarantees about the safety of your data or application.

Odd though. I don't see in the repro gist (https://gist.github.com/xlab/f7aee266ab741a0412b44ee145ccbc23) any use of goroutines. Of course, I suppose the Go runtime is free to move one's go-routines to different threads at will unless one uses LockOSThread.

func LockOSThread()

LockOSThread wires the calling goroutine to its current operating system thread. Until the calling goroutine exits or calls UnlockOSThread, it will always execute in that thread, and no other goroutine can.

Are we to conclude that correct use of lmdb-go requires first calling LockOSThread() from a designated go-routine, and then only accessing the api from that goroutine?

Jan 29 '17 14:01 glycerine

Odd though. I don't see in the repro gist (https://gist.github.com/xlab/f7aee266ab741a0412b44ee145ccbc23) any use of goroutines.

Indeed. I'm sure I would have thought of this more immediately if there were direct use of goroutines.

Are we to conclude that correct use of lmdb-go requires first calling LockOSThread() from a designated go-routine, and then only accessing the api from that goroutine?

I have tried to document this as clearly as possible in various documentation surrounding the BeginTxn, Update, and View methods on Txn. This issue may require me to take another read over those docs.

But yes, write transactions must be confined to a single goroutine which has called LockOSThread(). You can use different goroutines for different write transactions, you just have to ensure that each goroutine that wants to write to the database lock its thread before doing so and never lets the Txn object's methods be called from another goroutine.

The Txn.Update method does its best to help the programmer by locking and unlocking the calling goroutine's thread at the appropriate times. But it also cannot stop the programmer from ignoring documentation and passing a write Txn to another goroutine (as an argument or through closure) and using it unsafely.

LMDB requires care to use (regardless of your language or platform). I believe that to be an inescapable truth. I think it is not too bad to use once you know what kind of operations are safe for your application to do. But one must take time to learn their standard gopher intuition can get them into bad territory.

Jan 29 '17 20:01 bmatsuo

write transactions must be confined to a single goroutine which has called LockOSThread(). You can use different goroutines for different write transactions, you just have to ensure that each goroutine that wants to write to the database lock its thread before doing so and never lets the Txn object's methods be called from another goroutine.

Excellent. Thanks for the clarification.

I agree it would be a kindness to users to mention runtime.LockOSThread in the README and in the intro to the package (that which shows up at the top of the godoc). None of the examples use it either at present.

it's very error-prone.

@xlab: could you elaborate on the wrong ways to use LockOSThread -- what makes it error prone?

Jan 29 '17 21:01 glycerine

I can confirm that the original gist deadlocks on linux, even without the "unsafe" flag lmdb.NoTLS.

However when I add the runtime.LockOSThread() calls in, as in this gist https://gist.github.com/glycerine/dc92cf0e41db6806711d9936fa2d23d1

then it runs to completion.

$ ./lmdb_deadlock_no_more.go 
2017/01/29 21:19:37 sync batch 0
2017/01/29 21:19:37 sync batch 64000
2017/01/29 21:19:37 sync batch 128000
2017/01/29 21:19:37 sync batch 192000
2017/01/29 21:19:37 sync batch 256000
2017/01/29 21:19:37 sync batch 320000
2017/01/29 21:19:38 sync batch 384000
2017/01/29 21:19:38 sync batch 448000
2017/01/29 21:19:38 sync batch 512000
2017/01/29 21:19:38 sync batch 576000
2017/01/29 21:19:38 sync batch 640000
2017/01/29 21:19:38 sync batch 704000
2017/01/29 21:19:38 sync batch 768000
2017/01/29 21:19:38 sync batch 832000
2017/01/29 21:19:38 sync batch 896000
2017/01/29 21:19:38 sync batch 960000
2017/01/29 21:19:38 last put took: 7.966494ms
2017/01/29 21:19:38 last key: 001000000
2017/01/29 21:19:38 done
$

Jan 29 '17 21:01 glycerine

@bmatsuo @glycerine AFAIK, the main goroutine is a goroutine too, so runtime can switch main over different threads too. That's unfortunate.

As for LockOSThread — it behaves like a global mutex that is locked by another person sometimes, and you can't be sure about that. If somebody creates a package API that locks goroutines to threads internally, you must avoid locking, but otherwise you are obligated to use LockOSThread. And if you lock where you really shouldn't, you'll get a deadlock of LockOSThread. If you don't lock where you must, you get a problem with threads and POSIX mutexes in the LMDB internals.

This situation is specifically bad for cases:

you want to call "safe" LMDB methods from goroutines that already bound to thread,
you have limited amount of threads to use,
you have 2 packages, each package exposes set of methods that lock internally, and others don't lock internally, so you must lock manually but carefully avoid locking when using methods that will try to lock internally,
imagine the same but for 3 packages ;)

@glycerine ,

gist deadlocks on linux, even without the "unsafe" flag lmdb.NoTLS.

Like I noticed, it gets locked because POSIX mutexes are bound to threads, they're the cause of deadlocks here. SysV mutexes keep their state system-wide, I guess it's in the kernel.

Jan 29 '17 21:01 xlab

if you lock where you really shouldn't, you'll get a deadlock of LockOSThread. If you don't lock where you must, you get a problem with threads and POSIX mutexes in the LMDB internals.

Thanks for the lucid explanation. It does seem very error prone, indeed, to rely on a coarse, global lock. Much better would be to have goroutine serialization controlled by a library-specific lock.

Jan 29 '17 23:01 glycerine

I can confirm that the original gist deadlocks on linux, even without the "unsafe" flag lmdb.NoTLS.

lmdb.NoTLS will actually have no effect, in general. It is defined for completeness, but general usage of lmdb-go will ensure that the flag is always provided to the underlying C library to minimize the number of places that LockOSThread must be called (which is still non-zero).

https://godoc.org/github.com/bmatsuo/lmdb-go/lmdb#Env.Open

@xlab Your concerns about LockOSThread are absolutely valid. Calling Env.Update from a goroutine that is already locked to a thread is a serious problem. There are a couple things that you can do with lmdb-go.

I have defined to methods, Env.Update and Env.UpdateLocked. If you know you are already locked to a thread you can call Env.UpdateLocked instead of Env.Update and you will not get premature unlocking of the goroutine -- the method's only real purpose is to provide automatic Commit/Rollback of a lmdb.Txn.

https://godoc.org/github.com/bmatsuo/lmdb-go/lmdb#Env.Update

https://godoc.org/github.com/bmatsuo/lmdb-go/lmdb#Env.UpdateLocked

If you truly cannot determine if the thread will be locked at the callsite of Txn.Update you cannot make an update in that goroutine. You may create a new goroutine to issue the update, because we can guarantee that the new goroutine can be safely locked to a thread. Instead of spinning up new goroutines on the spot you may also have a long-lived worker goroutine responsible for issuing all updates, taking lmdb.TxnOp functions over a channel and responding to the sender with the corresponding error (through a different channel). The design of the solution here is up to the application.
If you are going to mix together a whole lot of CGO packages that all use TLS it might be better to look for safer alternatives to some of those packages. Maybe boltdb instead of lmdb-go would fit the use case fine (to my knowledge it does not require thread locking). Maybe a safer version of your other library is OK because it is not as critical for performance. IDK. It's up to the application maintainers. But if you need to mix the operations of these C libraries you are probably going to have become quite careful about how you go about doing that (as I imagine you would in a threaded C application using the underlying libraries).

It's unfortunate that from an API perspective this is all ~the library~ (edit: the lmdb-go package) can provide. Unfortunately, to my knowledge, a goroutine is not able to determine its locked status. It may only call LockOSThread() or UnlockOSThread(), and it is a shortcoming of the API that this can result in premature unlocking (there is no counter on the number of preceding Lock calls, as that also presents problems).

@xlab If sysv semaphores do really allow lmdb-go to eliminate thread locking all together (when combined with the lmdb.NoTLS flag that the package already enforces) that is very interesting, and I would be interested in allowing applications to benefit from that. I definitely think it's worth following up about this on the openldap-technical mailing list where LMDB development is discussed.

@glycerine I will look into providing more insight into goroutine/thread considerations in the top-level README. I will also probably add some clarifications to the documentation for the methods Env.BeginTxn, and Env.View to discuss thread locking.

Thank you both for you help with this issue. I'm going to leave it open until I've merged changes to the docs.

Jan 30 '17 08:01 bmatsuo

It's unfortunate that from an API perspective this is all the library (edit: the lmdb-go package) can provide.

why not just use a simple sync.RWMutex that is lmdb-go specific and leave out the LockOSThread and (all other locking) all together?

Jan 31 '17 00:01 glycerine

@glycerine: In general, it is a requirement of the LMDB C library that all use of a transaction occur on the thread that creates it. Calling LockOSThread cannot be avoided safely as far as I know (though, perhaps, sysv sem is the solution -- as I was saying in my last comment).

(edit: For completeness, I've include the documentation about MDB_NOTLS, because the official docs are holy when using LMDB)

MDB_NOTLS Don't use Thread-Local Storage. Tie reader locktable slots to MDB_txn objects instead of to threads. I.e. mdb_txn_reset() keeps the slot reseved for the MDB_txn object. A thread may use parallel read-only transactions. A read-only transaction may span threads if the user synchronizes its use. Applications that multiplex many user threads over individual OS threads need this option. Such an application must also serialize the write transactions in an OS thread, since LMDB's write locking is unaware of the user threads.

However, this is an interesting point. I have an extremely experimental package, github.com/bmatsuo/lmdb-go/exp/lmdbsync. It is documented in the project's main README. It was developed for other reasons (which you can read about and I won't explain here). But, it does have a sync.RWMutex and it supports the lmdb.NoLock flag, which disables locking in the C library. From the C documentation:

MDB_NOLOCK Don't do any locking. If concurrent access is anticipated, the caller must manage all concurrency itself. For proper operation the caller must enforce single-writer semantics, and must ensure that no readers are using old transactions while a writer is active. The simplest approach is to use an exclusive lock so that no readers may be active at all when a writer begins.

I had not considered that the use of this flag might lift the TLS requirement put on write transactions (all transations). It is another good question for the official mailing list.

Of course, if you were to use my experimental package right now, it would lock its Update transactions to a thread. But this could change if all these questions do pan out.

Jan 31 '17 01:01 bmatsuo

Wow, good to know that locking mechanisms in LMDB C library are so decoupled from the logic that it's possible to turn them off with a simple define override. I hope that the approach of replacing C locking mechanism with Go locking mechanism on the Go side will work as good as it sounds to me! 😉

Jan 31 '17 12:01 xlab

It's worth noting that use of MDB_NOLOCK brings with it caveats that are not present by default. So I'm not sure that its use can ever truly supersede the current, low-level interface in the lmdb package. This is what I mean:

The RWMutex access pattern required when using MDB_NOLOCK is actually more restrictive than vanilla LMDB, which allows read transactions concurrent with write transactions. Secondly, presumably the lack of any locking in LMDB is going to make safe multi-processing impossible (that is, it will not be possible to safely read and write to the database concurrently from multiple OS processes).

I'm not particularly willing to throw out either of these features. So I think the lmdb package is likely going to retain its existing API (with thread locking). But, stabilizing some kind of wrapper that does allow MDB_NOLOCK (like the experimental lmdbsync package) seems possible down the line.

Feb 01 '17 16:02 bmatsuo

The RWMutex access pattern required when using MDB_NOLOCK is actually more restrictive than vanilla LMDB, which allows read transactions concurrent with write transactions.

good point.

So we'd have to roll our own synchronization. Still might be worth it. Probably not that hard. Just look at what lmdb is doing already, and move that "upstream" into the go library.

Feb 01 '17 20:02 glycerine

The RWMutex access pattern required when using MDB_NOLOCK is actually more restrictive than vanilla LMDB, which allows read transactions concurrent with write transactions.

[...] So we'd have to roll our own synchronization.

@glycerine I'm not sure I understand what you mean. I don't think lmdb.h exports an API which would allow a wrapper to properly synchronize an MDB_NOLOCK environment the way LMDB will itself when MDB_NOLOCK is omitted.

I am not particularly interested in patching lmdb.h, mdb.c, or other LMDB files because it will make pulling updates from upsteam LMDB more challenging.

Feb 06 '17 01:02 bmatsuo

lmdb-go lmdb-go copied to clipboard

Deadlock under Linux

lmdb-go
lmdb-go copied to clipboard