Crash consistency issue after acquiring bucket locks

Open iangneal opened this issue 3 years ago • 2 comments

Bug

Exposed by crashing after acquiring a lock from clht_put.

https://github.com/utsaslab/RECIPE/blob/fc508ddfae1ca0d77cf3d3f1b73849e65c223f26/P-CLHT/include/clht_lb_res.h#L306-L312

  • Crashing after line 311 here means the lock is never released, so the restarted example waits indefinitely

Steps to reproduce

gdb --args ./example 20 1
> break clht_lb_res.h:311
> run
> next
> p *lock
# should print "$1 = 1 '\001'"
> quit
# Then, re-run
./example 20 1

The second execution should hang indefinitely, waiting to acquire the lock.
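
The hang can be modelled in a few lines of C. This is a minimal sketch, not the P-CLHT code: `pm_bucket_t`, `pm_trylock`, and `pm_unlock` are hypothetical names, and an ordinary struct stands in for a bucket that would live in persistent memory. It only illustrates that a lock byte stored alongside the data keeps its value across a crash unless something clears it.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a bucket whose lock byte is stored
 * in persistent memory together with the key/value data. */
typedef struct {
    atomic_uchar lock;   /* 0 = free, 1 = held */
    uint64_t key;
    uint64_t val;
} pm_bucket_t;

/* Test-and-set try-lock: returns true only if the lock was free. */
static bool pm_trylock(pm_bucket_t *b)
{
    return atomic_exchange(&b->lock, 1) == 0;
}

/* Release the lock. A crash between pm_trylock() and pm_unlock()
 * leaves the byte at 1 in this model. */
static void pm_unlock(pm_bucket_t *b)
{
    atomic_store(&b->lock, 0);
}
```

If a process takes the lock and dies before calling `pm_unlock`, a restarted process that maps the same bytes sees `lock == 1`, and a spinning acquire built on `pm_trylock` never succeeds.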

Comments

I see your comments here about locking assumptions:

https://github.com/utsaslab/RECIPE/blob/fc508ddfae1ca0d77cf3d3f1b73849e65c223f26/P-CLHT/include/clht_lb_res.h#L162-L164

Does this mean this is a known issue, or does clht_lock_initialization just need to be added to clht_create? I ask because it seems that clht_lock_initialization is called in other places, just not in the recovery procedure.

iangneal avatar Apr 28 '21 04:04 iangneal

Hi @Dahca ,

Thanks for the report. We provide the PMDK version as a reference implementation that uses the pmemobj allocator, but it is not fully tested and does not yet implement some of the mechanisms we assumed in our paper (such as lock initialization and garbage collection). As you can also see in my comments, this is a known implementation issue caused by the absence of one of those post-crash mechanisms, lock initialization. clht_lock_initialization was provided as a reference implementation for initializing locks, for anyone who wants to implement the post-crash mechanisms. I agree those implementations are necessary for it to work properly in actual use, but I have not yet found time to work on them.
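
For readers unfamiliar with the idea, post-crash lock initialization can be sketched as a single-threaded pass that clears every bucket lock before any worker thread touches the table. The code below is an illustration under stated assumptions, not the body of clht_lock_initialization: the types `demo_bucket_t` and `demo_table_t`, the function `demo_reset_locks`, and the `DEMO_LOCK_*` constants are all hypothetical, and plain memory stands in for a persistent pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_LOCK_FREE 0
#define DEMO_LOCK_HELD 1

/* Hypothetical bucket: a lock byte plus a pointer to an overflow chain. */
typedef struct demo_bucket {
    volatile uint8_t lock;
    struct demo_bucket *next;
} demo_bucket_t;

/* Hypothetical table: a flat array of head buckets. */
typedef struct {
    size_t num_buckets;
    demo_bucket_t *table;
} demo_table_t;

/* Recovery pass: walk every head bucket and its overflow chain and
 * force each lock back to the free state. Assumes no other thread is
 * running yet. Returns how many stale (held) locks were cleared. */
static size_t demo_reset_locks(demo_table_t *t)
{
    assert(t != NULL && t->table != NULL);
    size_t stale = 0;
    for (size_t i = 0; i < t->num_buckets; i++) {
        for (demo_bucket_t *b = &t->table[i]; b != NULL; b = b->next) {
            if (b->lock != DEMO_LOCK_FREE) {
                b->lock = DEMO_LOCK_FREE;
                stale++;
            }
        }
    }
    return stale;
}
```

In this model the pass would run once when the table is reopened after a crash. Any table-level locks would need the same treatment, which this sketch does not cover.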

SeKwonLee avatar Apr 28 '21 14:04 SeKwonLee

Hey @SeKwonLee,

Thanks for the quick responses. I can also attempt to add a solution for this in the near future, but as I said in #18, I'll be slightly delayed by an upcoming deadline.

iangneal avatar Apr 28 '21 16:04 iangneal

We are closing this issue since it is a known issue covered by one of the assumptions presented in our original paper.

SeKwonLee avatar Apr 13 '23 21:04 SeKwonLee