
[question] In project 4, how to use RwLock to improve performance

Open WindSoilder opened this issue 5 years ago • 3 comments

Regarding project 4.md:

In our database that means that all read requests can be satisfied concurrently, but when a single write request comes in, all other activity in the system stops and waits for it. Implementing this is basically as simple as swapping the Mutex for RwLock.

I tried using RwLock first, but ran into trouble: I can't just simply swap the Mutex for RwLock. In the kvs get method, acquiring a read lock on the inner SharedKvStore object isn't enough to call reader.read() and related methods to get the result, because the seek and read methods on the reader need a mutable reference.

So it seems that it's not easy to use RwLock for this project. I still can't find a simple way to apply RwLock to an example struct like this:

#[derive(Clone)]
pub struct KvStore(Arc<Mutex<SharedKvStore>>);

#[derive(Clone)]
pub struct SharedKvStore {
    /// Directory for the log and other data
    path: PathBuf,
    /// The log reader
    reader: BufReaderWithPos<File>,
    /// The log writer
    writer: BufWriterWithPos<File>,
    /// The in-memory index from key to log pointer
    index: BTreeMap<String, CommandPos>,
    /// The number of bytes representing "stale" commands that could be
    /// deleted during a compaction
    uncompacted: u64,
}

WindSoilder avatar Jan 29 '20 08:01 WindSoilder

+1 on this; I feel the instruction is not very accurate. You can't have multiple readers on one open file handle; you need to create new file handles, like in the example lock-free reader, which means that for each thread you create a new set of readers (and underlying file handles). I'm not sure whether any real system does this: most have a buffer pool manager (which gives you an RW latch on each page it has read), and some use mmap (which is unsafe). Creating new readers may not be that bad, though, because the operating system caches file data and you are likely to run out of threads before running out of file descriptors.

In the end I just skipped most of project 4; my main takeaways were interior mutability and how to use channels and threads (I didn't learn much about the lock-free stuff). BTW: PingCAP seems to be working on a new set of training plans, though I'm not sure when it will come out...

at15 avatar Jan 31 '20 21:01 at15

For now I've just ended up with something like this:

struct InnerStore {
    folder_path: PathBuf,
    index: HashMap<String, u64>,
    useless_cmd: usize,
}

pub struct KvStore {
    inner: Arc<RwLock<InnerStore>>,
}

In each read/write request, I acquire the read/write lock on InnerStore, then create new file handles inside the method implementation.

WindSoilder avatar Feb 10 '20 02:02 WindSoilder

Looks like I'm not the only one who got confused, haha.

As quoted: "Implementing this is basically as simple as swapping the Mutex for RwLock."

Well, IT IS NOT! We need interior mutability for the KvStore file handle: RefCell gives you that under a single thread (or when protected by a Mutex), and RwLock gives you shared access between threads, but you can't combine the two, since RefCell isn't Sync.

If we are forced to use RwLock, we can only create a new RefCell (and reopen the KvStore file) again and again under the read lock.

Took me hours to figure it out, lol.

MichaelScofield avatar Jun 11 '20 10:06 MichaelScofield