HDDS-7199. Implement new mix workload Read/Write Freon command which meets specific test requirements

Open DaveTeng0 opened this issue 2 years ago • 4 comments

What changes were proposed in this pull request?

In the Cisco/Intel cluster, measure read/write performance when there is a very large amount of metadata in RocksDB, using working sets of different sizes, each larger than the available cache.

  • Pure read
  • Pure write
  • Mixed workload, Read + Write
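The three modes above can be viewed as one knob: a read percentage of 100 gives pure read, 0 gives pure write, and anything in between gives a mixed workload. The following is a minimal Java sketch of that idea only. It is not the Freon command from this PR, and the class name, `readPercent`, and the seed parameter are hypothetical.

```java
import java.util.Random;

// Hypothetical sketch; not the Freon command added by this PR.
final class MixedWorkloadSketch {
  enum Op { READ, WRITE }

  private final int readPercent; // 0..100, share of operations that are reads
  private final Random rng;

  MixedWorkloadSketch(int readPercent, long seed) {
    if (readPercent < 0 || readPercent > 100) {
      throw new IllegalArgumentException("readPercent must be in [0, 100]");
    }
    this.readPercent = readPercent;
    this.rng = new Random(seed); // fixed seed keeps a run repeatable
  }

  // Picks the next operation: READ with probability readPercent / 100.
  Op nextOp() {
    return rng.nextInt(100) < readPercent ? Op.READ : Op.WRITE;
  }
}
```

A real test would call `nextOp()` once per iteration and dispatch to the cluster's read or write path accordingly.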

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7199

How was this patch tested?

Robot tests, manual tests in cluster.

DaveTeng0 avatar Sep 14 '22 18:09 DaveTeng0

cc. @kerneltime @jojochuang @umamaheswararao @duongkame

DaveTeng0 avatar Sep 14 '22 18:09 DaveTeng0

cc @duongkame

kerneltime avatar Sep 20 '22 22:09 kerneltime

Yes! The command can potentially read a key that doesn't exist in the cluster. Currently the Freon command stops running and reports a failure, but let me think more about how to make this better!

DaveTeng0 avatar Sep 28 '22 01:09 DaveTeng0

The code can continue testing and report a metric at the end for successful reads vs. not-found reads. It is a valid test to look at how OM performs when reporting that a key does not exist. Object stores can be bombarded with nonexistent keys, and the performance of the underlying storage when reporting keys that don't exist is important. LSM-tree-based storage may have to check every level before reporting a key as not found, so in some ways this represents the worst-case performance at scale.
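The idea of continuing on a miss and reporting both outcomes at the end could look roughly like the Java sketch below. It is an illustration under assumptions, not Freon code: the class name is hypothetical, and the `lookup` function is a stand-in for whatever client call actually reads a key (here, an empty `Optional` means "not found").

```java
import java.util.Optional;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

// Hypothetical sketch; not part of Freon.
final class ReadOutcomeSketch {
  private final LongAdder found = new LongAdder();
  private final LongAdder notFound = new LongAdder();
  // Stand-in for the real read call; empty Optional means the key is missing.
  private final Function<String, Optional<byte[]>> lookup;

  ReadOutcomeSketch(Function<String, Optional<byte[]>> lookup) {
    this.lookup = lookup;
  }

  // Reads one key and counts the outcome instead of aborting on a miss.
  void read(String key) {
    if (lookup.apply(key).isPresent()) {
      found.increment();
    } else {
      notFound.increment();
    }
  }

  long found() {
    return found.sum();
  }

  long notFound() {
    return notFound.sum();
  }

  // One-line report to print when the run finishes.
  String summary() {
    return String.format("reads: found=%d notFound=%d", found.sum(), notFound.sum());
  }
}
```

`LongAdder` is used so that many worker threads can record outcomes without contending on a single counter.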

kerneltime avatar Oct 12 '22 05:10 kerneltime

Thanks, Ritesh, for the context! I'll create a separate Jira for this. It definitely makes sense!

DaveTeng0 avatar Oct 19 '22 21:10 DaveTeng0

Agreed, reading nonexistent keys is a valid test case and the tool should support it deterministically. To do that, it has to know (on its own) which keys exist and which don't. Warp does that with a pre-test phase in which it creates a set of keys (10K or so) and keeps the created keys in memory for the real read test.

We can do the same for this tool by maintaining a set of known keys that is initialized in a pretest phase and grows during the write test.
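A rough Java sketch of such a known-key set follows. It only illustrates the approach described above and makes no claim about how Warp or Freon implement it; the class name, the `pre-` and `missing-` key prefixes, and the `writer` callback (a stand-in for the real write call) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch; not taken from Warp or Freon.
final class KnownKeysSketch {
  // Keys with this prefix are never written, so reads of them always miss.
  static final String MISSING_PREFIX = "missing-";

  private final List<String> keys = new ArrayList<>(); // for random access
  private final Set<String> index = new HashSet<>();   // for membership checks
  private final Consumer<String> writer;               // stand-in for the real write call

  KnownKeysSketch(Consumer<String> writer) {
    this.writer = writer;
  }

  // Pretest phase: create an initial set of keys and remember them.
  synchronized void pretest(int count) {
    for (int i = 0; i < count; i++) {
      write("pre-" + i);
    }
  }

  // Write test: every written key joins the known set.
  synchronized void write(String key) {
    if (key.startsWith(MISSING_PREFIX)) {
      throw new IllegalArgumentException("prefix reserved for nonexistent keys");
    }
    writer.accept(key);
    if (index.add(key)) {
      keys.add(key);
    }
  }

  // Returns a key that is known to exist.
  synchronized String existingKey(Random rng) {
    if (keys.isEmpty()) {
      throw new IllegalStateException("run pretest first");
    }
    return keys.get(rng.nextInt(keys.size()));
  }

  // Returns a key that was never written through this class.
  String missingKey(long n) {
    return MISSING_PREFIX + n;
  }

  synchronized boolean isKnown(String key) {
    return index.contains(key);
  }

  synchronized int size() {
    return keys.size();
  }
}
```

With this in place, a read test can choose deliberately between `existingKey` and `missingKey`, so the ratio of hits to misses is controlled by the tool rather than left to chance.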

duongkame avatar Oct 19 '22 22:10 duongkame

There are failures in CI. Please make sure CI passes before merging, @kerneltime.

kaijchen avatar Oct 20 '22 02:10 kaijchen

Sorry, I had to revert this because the checkstyle and findbugs failures affect all other PRs. Please fix the failures and open a new PR.

adoroszlai avatar Oct 20 '22 02:10 adoroszlai

Sure! I'll take a look at how Warp does it and create a Jira for it!

DaveTeng0 avatar Oct 20 '22 03:10 DaveTeng0

Sorry! I'll take a look at the CI failures!

DaveTeng0 avatar Oct 20 '22 03:10 DaveTeng0

Sorry! I'll take a look! Thanks, Attila!

DaveTeng0 avatar Oct 20 '22 03:10 DaveTeng0

Thank you @adoroszlai! I should have checked.

kerneltime avatar Oct 20 '22 18:10 kerneltime