rocksdb icon indicating copy to clipboard operation
rocksdb copied to clipboard

Recursive read locks in port::RWMutex

Open autopear opened this issue 1 year ago • 2 comments

Existing port::RWMutex has different underlying implementations with different ports. The posix pthread_rwlock_t allows read-locking on the same thread for multiple times

A thread may hold multiple concurrent read locks on rwlock (that is, successfully call the pthread_rwlock_rdlock() function n times). If so, the thread must perform matching unlocks (that is, it must call the pthread_rwlock_unlock() function n times).

but this may not the case for other ports. For example, SRWLOCK in Windows does not support recursive read-locking

Shared mode SRW locks should not be acquired recursively as this can lead to deadlocks when combined with exclusive acquisition.

Also, the behavior of std::shared_mutex::lock_shared (though not used in RocksDB) is undefined if the same mutex is read-locked multiple times on the same thread.

If lock_shared is called by a thread that already owns the mutex in any mode (exclusive or shared), the behavior is undefined.

IMO, we should not assume recursive read (or write) locks of port::RWMutex is always allowed.

Currently there are some usages of port::RWMutex that may have recursive read-locks in the same thread. We have found this at least in WriteableCacheFile.

bool RandomAccessCacheFile::Read(const LBA& lba, Slice* key, Slice* val,
                                 char* scratch) override {
  ReadLock _(&rwlock_);

  assert(lba.cache_id_ == cache_id_);

  if (!freader_) {
    return false;
  }

  Slice result;
  Status s = freader_->Read(IOOptions(), lba.off_, lba.size_, &result, scratch,
                            nullptr, Env::IO_TOTAL /* rate_limiter_priority */);
  if (!s.ok()) {
    Error(log_, "Error reading from file %s. %s", Path().c_str(),
          s.ToString().c_str());
    return false;
  }

  assert(result.data() == scratch);

  return ParseRec(lba, key, val, scratch);
}

bool WriteableCacheFile::Read(const LBA& lba, Slice* key, Slice* block, char* scratch) override {
  ReadLock _(&rwlock_);
  const bool closed = eof_ && bufs_.empty();
  if (closed) {
    // the file is closed, read from disk
    return RandomAccessCacheFile::Read(lba, key, block, scratch);
  }
  // file is still being written, read from buffers
  return ReadBuffer(lba, key, block, scratch);
}

If the file has already been closed, rwlock_ is read-locked first in WriteableCacheFile::Read and then read-locked again in RandomAccessCacheFile::Read, this may cause deadlocks on some non-posix systems or alternative port implementations.

autopear avatar Nov 05 '24 18:11 autopear

@autopear is this issue solved ? can I solve this issue ?

kaustubh2332 avatar Nov 08 '25 22:11 kaustubh2332

@kaustubh2332 I believe it's not. Feel free to submit a PR to fix it.

autopear avatar Nov 10 '25 18:11 autopear