XIA-for-Linux
XIA-for-Linux copied to clipboard
xia/lpm: remove routing/forwarding deadlock
Due to the way that locking was written into this toy LPM implementation, the read-only code in lpm_deliver() obtains a write lock on the tree FIB. The effect of this is the unnatural combination of first obtaining an RCU read lock and then a FIB write lock in lpm_deliver(). This conflicts with functions like local_newroute(), which first obtain a FIB write lock and then wait for RCU readers to finish before flushing anchors. The problem is that deadlock can occur with the following interleaving:
thread1 thread2
======= =======
local_newroute(): get write lock
lpm_deliver(): get RCU read lock
local_newroute(): wait for RCU readers
lpm_deliver(): wait for write lock
Notice that this deadlock cannot occur if lpm_deliver() gets the write lock first.
To fix this, we can duplicate the FIB entry whose anchor needs to be flushed, replace the old entry with the duplicate, and then release the lock. This allows writers of the tree, including the code in lpm_deliver(), to proceed. We then wait an RCU synchronization to flush the old entry's anchor, since routing mechanism code (not involved with the LPM tree) may still be reading that entry. Once all these readers are done, the old entry can be reclaimed.
This removes the deadlock, but future iterations of the LPM principal should use RCU instead of rwlocks to avoid this unnatural locking.
Hi Cody,
I found a rare bug that can occur at local_newroute() or main_newroute().
Consider the following execution:
- tree_fib_newroute_lock() returns successfully.
- newroute_flush_anchor_locked() fails.
- fxid_free_norcu() is going to release an item that is live in the tree due to step 1.
Thanks, Michel. I just rebased so that the patch has newroute_flush_anchor_locked()
remove the entry if it's going to fail.
The pull request is ready for review again. I separated the other bug found into a separate patch, since it is a distinct change that affected multiple files.
Patch xia/lpm: remove routing/forwarding deadlock is ready for merge.