O(n^2) time complexity of adding a lock to a transaction
Acquiring a lock in a transaction currently requires iterating over all of the existing locks in order to ensure that a situation like this:
TX_BEGIN(pop) {
    TX_BEGIN_LOCK(pop, &lock) {
    } TX_END
    TX_BEGIN_LOCK(pop, &lock) {
    } TX_END
} TX_END
does not create a deadlock.
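A minimal sketch of the current situation, assuming the acquired locks are kept in a simple linked list (the struct and function names below are made up for illustration, not the actual libpmemobj internals):

#include <stdlib.h>

struct tx_lock_entry {
    void *lock;                  /* the pmem lock object */
    struct tx_lock_entry *next;
};

/* adding a lock means scanning every lock the transaction already holds */
static int
add_lock(struct tx_lock_entry **head, void *lock)
{
    for (struct tx_lock_entry *e = *head; e != NULL; e = e->next)
        if (e->lock == lock)
            return 0; /* already held - skip to avoid a deadlock */

    struct tx_lock_entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->lock = lock;
    e->next = *head;
    *head = e;
    /* the actual lock acquisition (e.g. pmemobj_mutex_lock) would go here */
    return 0;
}

With n locks added this way, each call scans the locks added before it, which is where the quadratic cost comes from.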
The container of the currently active locks must be changed in order to improve the performance of this operation.
This could be solved by including locks in the ranges tree, with a flag denoting the type of lock. Then, during tx_end, the tree traversal would also perform the unlocks.
This solution actually simplifies the implementation of transactions while significantly improving performance.
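A rough sketch of that idea; the enum, struct, and function names below are hypothetical, only the pmemobj_mutex_unlock/pmemobj_rwlock_unlock calls are existing API. Keying the tree lookup by the lock address would also make the duplicate check logarithmic instead of linear:

#include <stddef.h>
#include <libpmemobj.h>

/* hypothetical tag added to every node of the ranges tree */
enum tx_range_type {
    TX_RANGE_SNAPSHOT,     /* a regular undo-log range */
    TX_RANGE_LOCK_MUTEX,
    TX_RANGE_LOCK_RWLOCK,
};

struct tx_range_node {
    enum tx_range_type type;
    void *addr;   /* snapshotted address, or the lock object itself */
    size_t size;  /* meaningful only for snapshots */
};

/* a single tx_end traversal handles ranges and unlocking alike */
static void
tx_end_visit(PMEMobjpool *pop, struct tx_range_node *n)
{
    switch (n->type) {
    case TX_RANGE_SNAPSHOT:
        /* apply/discard the undo log for this range */
        break;
    case TX_RANGE_LOCK_MUTEX:
        pmemobj_mutex_unlock(pop, (PMEMmutex *)n->addr);
        break;
    case TX_RANGE_LOCK_RWLOCK:
        pmemobj_rwlock_unlock(pop, (PMEMrwlock *)n->addr);
        break;
    }
}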
Another solution would be to include the lane index in the persistent mutex. The lane index effectively identifies the outermost transaction and could be used for this. Since mutexes are in fact runtime data held in persistent memory, no layout bump should be necessary.
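A hedged sketch of what that could look like; the field and function names are invented for illustration:

#include <stdint.h>

/* hypothetical layout: the persistent mutex remembers which lane owns it */
struct pmem_mutex {
    uint64_t owner_lane;   /* lane idx of the owning (outermost) transaction */
    /* ... runtime lock data follows ... */
};

/* O(1) "do I already hold this?" check, no list traversal */
static int
tx_mutex_lock(struct pmem_mutex *m, uint64_t my_lane)
{
    if (m->owner_lane == my_lane)
        return 0; /* already owned by this transaction */

    /* acquire the underlying runtime lock here */
    m->owner_lane = my_lane;
    return 0;     /* owner_lane would have to be cleared again on tx_end */
}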
I think the example does not illustrate the issue, as it acquires the same lock twice, so the complexity is O(n). Is there any example of a real-world scenario where it actually becomes a problem? I.e. an example of a transaction with ~100 nested TX, where each nested TX acquires another lock. Does it happen in our *tree examples?
I like the idea of using lane idx - if only there is some space left in our pmem lock structs.
The example shows why we have to traverse all locks held by the transaction - because our transactions are flattened. For each new lock we do a foreach over the already acquired locks. I don't think this happens in our examples, as we don't take a lot of locks in them and the overhead is negligible. This could be a problem in long-running transactions that call nested transactions which take locks.
I think we still have ample space in the persistent locks.
@krzycz: as Tomek said, the example is to show why we need the iteration.
The time complexity of this operation is currently O(n * (m + n)), where n is the number of locks being added and m is the number of existing locks, which simplifies to O(n^2).
Sure, I got it. What I meant is that the better example for this would be:
TX_BEGIN(pop) {
    for (int i = 0; i < N; i++)
        TX_BEGIN_LOCK(pop, &locks[i]) {
            /* do something */
        } TX_END
} TX_END
What's even worse, if you'd like to grab all the locks at the beginning of the outermost TX, it would still result in a pointless foreach loop for each mutex on the list.
TX_BEGIN(pop, TX_PARAM_MUTEX, &lock1, TX_PARAM_MUTEX, &lock2, ... /* N locks here */) {
} TX_END
So, definitely we need to fix/improve this.
Yet another option is to (always/optionally?) treat the pmem locks as recursive locks. Then, we don't need to iterate through the list of acquired locks. (TX_PARAM_MUTEX_RECURSIVE?)
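Something like this, purely as a sketch (all names are invented for illustration):

#include <stdint.h>

/* hypothetical recursive pmem lock: owner + depth instead of a list walk */
struct pmem_mutex_recursive {
    uint64_t owner;   /* id of the lane/thread holding the lock, 0 if free */
    uint64_t depth;   /* how many times the owner has acquired it */
    /* ... underlying runtime lock data ... */
};

static void
recursive_lock(struct pmem_mutex_recursive *m, uint64_t self)
{
    if (m->owner == self) {
        m->depth++;   /* re-entry: no iteration over acquired locks */
        return;
    }
    /* acquire the underlying lock here */
    m->owner = self;
    m->depth = 1;
}

static void
recursive_unlock(struct pmem_mutex_recursive *m)
{
    if (--m->depth == 0) {
        m->owner = 0;
        /* release the underlying lock here */
    }
}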
I see two issues with using recursive mutexes:
1. I don't think there are recursive rwlocks on Linux.
2. You have to unlock a recursive mutex exactly the same number of times you lock it.
In our current flat transaction design number 2 is ~~not~~ fairly easy to do, but 1 isn't. In my opinion being posix-recursive is not the way to go.
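For reference, the plain POSIX behaviour being discussed: a recursive pthread mutex counts its lock depth, so every lock needs a matching unlock (this is a standalone pthread example, not libpmemobj code):

#include <pthread.h>

int
recursive_mutex_demo(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m, &attr);

    pthread_mutex_lock(&m);
    pthread_mutex_lock(&m);   /* second lock by the same thread succeeds */

    pthread_mutex_unlock(&m); /* still held after one unlock... */
    pthread_mutex_unlock(&m); /* ...released only after the matching one */

    pthread_mutex_destroy(&m);
    pthread_mutexattr_destroy(&attr);
    return 0;
}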
So, if I understand correctly, we only have recursive locks.
In this example b calls a, and a can successfully acquire the lock which b already holds. Right?
void a(void)
{
    TX_BEGIN_LOCK(pop, &some_lock) {
        some_value++;
    } TX_END
}

void b(void)
{
    TX_BEGIN_LOCK(pop, &some_lock) {
        int x = some_value;
        use(x);
        a();
        use(x);
    } TX_END
}
@GBuella : yes
If you still consider this question important, please reopen the issue and provide more context for your request so we can reassess its priority.