Support multi-disk (multi-path)

Open xiaobiaozhao opened this issue 2 years ago • 12 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

Motivation

When the host has multiple disks, kvrocks can use several of them for data storage to improve performance. Hot data can be stored on local SSDs and cold data on cloud disks.

Solution

option.db_paths = {
                     {"/disk1", 1000 * 1000 * 1000},
                     {"/disk2", 1000 * 1000 * 1000},
                     {"/disk3", 1000 * 1000 * 1000},
                     {"/disk4", 1000 * 1000 * 1000}};

https://github.com/facebook/rocksdb/blob/main/include/rocksdb/options.h#L672

Are you willing to submit a PR?

  • [X] I'm willing to submit a PR!

xiaobiaozhao avatar Sep 25 '22 01:09 xiaobiaozhao

Hi @xiaobiaozhao, I have two questions:

  1. Why can multiple disks improve performance? Multiple paths do not seem to work in parallel.
  2. How do we judge hot and cold data in kvrocks? RocksDB simply decides where to place an SST file based on when that file was generated.

caipengbo avatar Sep 25 '22 01:09 caipengbo

According to the description of the configuration, lower-level SST files are stored on the paths at the front of db_paths. So we can order db_paths by the speed of the storage medium and keep the low-level SST files on the faster medium; for example, list the SSD first in db_paths so that it stores the low-level SST files.

In fact, the level at which an SST file sits reflects how hot or cold its data is. Because RocksDB uses an LSM tree, it naturally merges cold data into higher levels.

So with this feature, RocksDB can store cold data on slower media such as mechanical hard drives and hot data on faster media such as SSDs.

tanruixiang avatar Sep 25 '22 15:09 tanruixiang

Yes. In my test demo, RocksDB uses only the first and last entries of the db_paths config.

option.db_paths = {
                     {"/disk1", 1000 * 1000 * 1000},
                     {"/disk2", 1000 * 1000 * 1000},
                     {"/disk3", 1000 * 1000 * 1000},
                     {"/disk4", 1000 * 1000 * 1000}};

Only disk1 and disk4 are used to write data. Also, RocksDB allows at most 4 db_paths.

xiaobiaozhao avatar Sep 25 '22 23:09 xiaobiaozhao
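
One possible explanation for the observation that only the first and last paths receive data can be sketched with a simplified model of level-to-path assignment. It assumes level-style compaction, a base level size of 256 MiB, and a level multiplier of 10. The names `PathBudget` and `PickPathForLevel` are hypothetical and are not RocksDB API. The sketch is only loosely modeled on how levels fill the paths in order and is not the library's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one db_paths entry: a directory and its target size.
struct PathBudget {
  std::string path;
  std::uint64_t target_size;
};

// Simplified model: levels consume the paths in order, and the last path is
// the fallback for every level that no earlier path can hold.
// level_base and multiplier are assumed level-size settings, not values
// taken from the thread. The paths vector must not be empty.
std::size_t PickPathForLevel(const std::vector<PathBudget>& paths, int level,
                             std::uint64_t level_base = 256ull << 20,
                             std::uint64_t multiplier = 10) {
  std::size_t p = 0;
  std::uint64_t remaining = paths[0].target_size;
  std::uint64_t level_size = level_base;  // L0 is estimated to be as large as L1
  int cur_level = 0;
  while (p + 1 < paths.size()) {
    if (level_size <= remaining) {
      if (cur_level == level) return p;   // the requested level fits here
      remaining -= level_size;            // reserve space for this level
      if (cur_level > 0) level_size *= multiplier;
      ++cur_level;
      continue;
    }
    ++p;                                  // the level does not fit, try the next path
    remaining = paths[p].target_size;
  }
  return p;                               // fallback: the last path
}
```

Under these assumptions, four paths of 1000 * 1000 * 1000 bytes each give index 0 for L0 and L1 and index 3 for L2 and deeper: L2 (about 2.7 GB in this model) does not fit in any single 1 GB budget, so it falls through to the last path and the two middle paths stay unused.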

@tanruixiang But about 90% of the data falls to the last layer of the LSM, so does that mean that 90% of the data is cold?

caipengbo avatar Sep 26 '22 01:09 caipengbo

Most of the data should be cold data. If data is hot, it re-enters the earlier layers, and the copy in the last layer may be deleted. For example, if key1 is in the last layer and we put key1 again, key1 returns to the earlier layers once the memtable is flushed to an SST file, and the key1 entry in the last layer becomes invalid. Of course, data that is only read should end up in the cache even if it sits in the last layer.

tanruixiang avatar Sep 26 '22 05:09 tanruixiang
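
The key1 example above can be illustrated with a toy model of an LSM tree, given here as a minimal sketch. `ToyLsm` and its members are hypothetical names. Real RocksDB adds memtables, flushes, and compaction, which this model leaves out: a write simply lands in the top level, and reads scan from newest to oldest.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy LSM model: levels[0] holds the newest data, levels.back() is the
// bottom level. num_levels must be at least 1.
struct ToyLsm {
  std::vector<std::map<std::string, std::string>> levels;

  explicit ToyLsm(std::size_t num_levels) : levels(num_levels) {}

  // New writes always enter at the top, even if an older version of the
  // key already exists in a deeper level.
  void Put(const std::string& key, const std::string& value) {
    levels.front()[key] = value;
  }

  // Index of the shallowest level that holds the key, or -1 if absent.
  int LevelOf(const std::string& key) const {
    for (std::size_t i = 0; i < levels.size(); ++i) {
      if (levels[i].count(key) != 0) return static_cast<int>(i);
    }
    return -1;
  }

  // Reads scan from newest to oldest, so the first hit shadows any older
  // version further down. Returns nullptr when the key is absent.
  const std::string* Get(const std::string& key) const {
    const int level = LevelOf(key);
    if (level < 0) return nullptr;
    return &levels[static_cast<std::size_t>(level)].at(key);
  }
};
```

In this model, a key that lives only in the bottom level moves back to level 0 as soon as it is written again. The stale copy stays in the bottom level but is never returned by `Get`, which mirrors the "invalid" entry described above.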

Hi @xiaobiaozhao, I have a question: I have two SSDs. Is there a way to split the hot data across them?

jishengming1 avatar Jan 30 '23 11:01 jishengming1

https://github.com/apache/incubator-kvrocks/pull/953/files#diff-e29cedc586b39d07b64be1df007101989a20f7d4452fc18fe23136f6d4ccd331R792

"/mnt/ssd 10G; /mnt/hdd 1T;" (hot data goes to /mnt/ssd, cool data to /mnt/hdd)

xiaobiaozhao avatar Jan 31 '23 07:01 xiaobiaozhao
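
For illustration, a path list written in the "/mnt/ssd 10G; /mnt/hdd 1T;" style could be split into (path, size) pairs roughly as follows. This is a sketch under assumptions, not the parser from the linked PR: the function names are hypothetical, and the size suffixes K, M, G, and T are assumed to be powers of 1024, which the thread does not state.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turns "10G" into a byte count.
// Assumption: K/M/G/T are binary multiples (1024-based).
std::uint64_t ParseSize(const std::string& text) {
  std::size_t pos = 0;
  const std::uint64_t value = std::stoull(text, &pos);  // throws if no digits
  if (pos == text.size()) return value;                 // plain byte count
  if (pos + 1 != text.size()) throw std::invalid_argument("bad size: " + text);
  switch (std::toupper(static_cast<unsigned char>(text[pos]))) {
    case 'K': return value << 10;
    case 'M': return value << 20;
    case 'G': return value << 30;
    case 'T': return value << 40;
    default: throw std::invalid_argument("bad size suffix: " + text);
  }
}

// Hypothetical parser for a "path size; path size;" list.
std::vector<std::pair<std::string, std::uint64_t>> ParseDbPaths(
    const std::string& spec) {
  std::vector<std::pair<std::string, std::uint64_t>> result;
  std::istringstream entries(spec);
  std::string entry;
  while (std::getline(entries, entry, ';')) {
    std::istringstream fields(entry);
    std::string path;
    std::string size;
    if (!(fields >> path)) continue;  // skip blank entries
    if (!(fields >> size)) {
      throw std::invalid_argument("missing size for " + path);
    }
    result.emplace_back(path, ParseSize(size));
  }
  return result;
}
```

The resulting pairs keep the order of the list, which matters because earlier paths are the ones intended for the hotter, lower-level data.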

I am not trying to separate hot data from cool data. I want the load on both disks to be the same, with the hot data distributed across the two disks.

jishengming1 avatar Jan 31 '23 08:01 jishengming1

You can try cluster mode.

xiaobiaozhao avatar Jan 31 '23 14:01 xiaobiaozhao

Thanks. Is there any other way if there is only one host?

jishengming1 avatar Feb 01 '23 02:02 jishengming1

You can build a RAID 0 array from several disks and place the data directory on it, or you can use a ZFS pool made up of several disks.

marlboroman81 avatar Feb 02 '23 16:02 marlboroman81

Thanks, I'll try.

jishengming1 avatar Feb 03 '23 02:02 jishengming1