r2d2 icon indicating copy to clipboard operation
r2d2 copied to clipboard

Unwinding can result in connections with broken invariants being returned to the connection pool

Open sgrif opened this issue 8 years ago • 11 comments

Currently PooledConnection unconditionally returns the connection to the pool on drop. Since a panic could happen at any time, we can't assume that the connection will be in a good state if it's being dropped as the result of unwinding. r2d2 should either check thread::panicking before checking the connection back into the pool, or should have a T: UnwindSafe constraint on the connection.

sgrif avatar Mar 06 '17 19:03 sgrif

Is this in reference to a connection implementation that will be in a bad state during a panic, or is this more of a general principle of the thing?

sfackler avatar Mar 07 '17 02:03 sfackler

I don't have a concrete example of a connection which has broken invariants when a panic occurs if that's what you mean. I'm sure I can find one pretty easily if you really don't believe that those cases exist.

sgrif avatar Mar 13 '17 17:03 sgrif

The fact that I'm not aware of any instance in which this has been a problem in the last several years guides my instincts to some extent 😃 .

That being said, I don't think I have a problem dropping connections that are checked back in after a panic. It can always be made configurable if that impacts someone's use case negatively.

sfackler avatar Mar 15 '17 17:03 sfackler

The fact that I'm not aware of any instance in which this has been a problem in the last several years

catch_unwind has been stable for less than a year. :smile:

sgrif avatar Mar 15 '17 18:03 sgrif

I don't see how that's really related? Other kinds of "panic safety" like mutex poisoning has been around for quite a long time.

sfackler avatar Mar 15 '17 18:03 sfackler

I just ran into this issue - it's definitely not theoretical, and has some pretty nasty consequences: namely transactions may not be closed correctly when a thread panics. Since database locks are held until the transaction is closed, this can result in the entire system locking up indefinitely.

I believe the "most correct" solution here would be to require manually returning the connection to the pool, because using thread::panicking is not quite correct (the connection may have been opened whilst unwinding, in which case it is safe to return).

For backwards compatibility, perhaps you could add a "discard" method and an "unwind-safe" wrapper which automatically discards the connection when dropped unless you explicitly return it to the pool.

Diggsey avatar Apr 17 '18 21:04 Diggsey

Hm, on second thoughts, the thread::panicking solution is probably "good enough" (tm) - the remaining functionality can be implemented on the underlying connection type using has_broken.

Diggsey avatar Apr 18 '18 12:04 Diggsey

We're running into this issue in production.

Does anyone know a workaround that works now?

Boscop avatar Apr 24 '20 10:04 Boscop

Can you not use RAII for your transaction management?

sfackler avatar Apr 24 '20 11:04 sfackler

@sfackler You mean rolling back the transaction on Drop if the drop was caused by a panic?

Boscop avatar Apr 24 '20 11:04 Boscop

I mean rolling back the connection on Drop if the transaction hasn't been committed: https://github.com/sfackler/rust-postgres/blob/master/tokio-postgres/src/transaction.rs#L30

sfackler avatar Apr 24 '20 12:04 sfackler