ozone HDDS-7199. Implement new mix workload Read/Write Freon command which meets specific test requirements

What changes were proposed in this pull request?

On the Cisco/Intel cluster, measure read/write performance when RocksDB holds a very large amount of metadata, using working sets of different sizes, each larger than the available cache.

Pure read
Pure write
Mixed workload, Read + Write

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-7199

How was this patch tested?

Robot tests, manual tests in cluster.

Sep 14 '22 18:09 DaveTeng0

cc. @kerneltime @jojochuang @umamaheswararao @duongkame

Sep 14 '22 18:09 DaveTeng0

cc @duongkame

Sep 20 '22 22:09 kerneltime

Yes! The command can potentially read a key that doesn't exist in the cluster. Currently the freon command stops running and reports a failure, but let me think more about how to make this better!!

Sep 28 '22 01:09 DaveTeng0

The code can continue testing and add a metric at the end for successful reads vs. not-found reads. Measuring how quickly OM reports that a key does not exist is a valid test. Object stores can be bombarded with nonexistent keys, and the performance of the underlying storage when reporting keys that don't exist is important. LSM tree-based storage has to scan all levels before reporting a key as not found, which in some ways represents the worst-case performance at scale.

Oct 12 '22 05:10 kerneltime
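
The suggestion above, to keep running and report successful reads vs. not-found reads at the end, could look roughly like the following. This is only a minimal sketch and not the Freon implementation: `ReadOutcomeCounter` and its methods are hypothetical names, and a plain in-memory `Map` stands in for the cluster so the example runs without any Ozone client.

```java
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: tallies read outcomes instead of aborting on a missing key. */
public class ReadOutcomeCounter {
  // LongAdder keeps counting cheap when many worker threads record results.
  private final LongAdder found = new LongAdder();
  private final LongAdder notFound = new LongAdder();

  /** Looks up one key in a stand-in store (assumption: not a real Ozone call). */
  public void read(Map<String, byte[]> store, String key) {
    if (store.get(key) != null) {
      found.increment();
    } else {
      // A missing key is recorded as a result, not treated as a test failure.
      notFound.increment();
    }
  }

  public long found() {
    return found.sum();
  }

  public long notFound() {
    return notFound.sum();
  }

  /** Summary line to print once the run has finished. */
  public String summary() {
    return "successful reads: " + found() + ", not-found reads: " + notFound();
  }

  public static void main(String[] args) {
    Map<String, byte[]> store =
        Map.of("key-1", new byte[] {1}, "key-2", new byte[] {2});
    ReadOutcomeCounter counter = new ReadOutcomeCounter();
    for (String key : new String[] {"key-1", "key-2", "key-3"}) {
      counter.read(store, key);
    }
    System.out.println(counter.summary());
  }
}
```

Keeping the two counters separate also allows latency for not-found lookups to be tracked on its own later, which is the case described above as the worst case for LSM tree-based storage.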

Thanks Ritesh for the context!! I'll create a separate jira for this! This definitely makes sense!!

Oct 19 '22 21:10 DaveTeng0

Agree, reading nonexistent keys is a valid test case and the tool should support it deterministically. To do that, it has to know (on its own) which keys exist and which don't. Warp does that with a pre-test phase in which it creates a set of keys (10K or so) and keeps the created keys in memory for the real read test.

We can do the same for this tool by maintaining a set of known keys that is initialized by a pre-test phase and grows with the write test.

Oct 19 '22 22:10 duongkame
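
The known-key idea above could be sketched as follows. This is an illustration under stated assumptions, not how Warp or Freon implement it: `KnownKeySet` and its methods are hypothetical, the keys are plain strings, and the sketch assumes that keys written by the test never start with the `missing-` prefix, so that a generated missing key cannot collide with a real one.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Hypothetical registry of keys the tool itself has created. */
public class KnownKeySet {
  private final List<String> ordered = new ArrayList<>(); // for random picks
  private final Set<String> members = new HashSet<>();    // for membership checks
  private final Random random;

  public KnownKeySet(long seed) {
    // A fixed seed makes the sequence of picked keys reproducible.
    this.random = new Random(seed);
  }

  /** Called by the pre-test phase and by every successful write. */
  public synchronized void add(String key) {
    if (members.add(key)) {
      ordered.add(key);
    }
  }

  public synchronized boolean contains(String key) {
    return members.contains(key);
  }

  public synchronized int size() {
    return ordered.size();
  }

  /** Returns a key that is known to exist. */
  public synchronized String pickExisting() {
    if (ordered.isEmpty()) {
      throw new IllegalStateException("no known keys; run the pre-test phase first");
    }
    return ordered.get(random.nextInt(ordered.size()));
  }

  /** Returns a key outside the set (assumes real keys never use this prefix). */
  public synchronized String pickMissing() {
    return "missing-" + random.nextInt(Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    KnownKeySet keys = new KnownKeySet(42L);
    for (int i = 0; i < 10; i++) {
      keys.add("pretest-" + i);   // pre-test phase
    }
    keys.add("write-0");          // the set grows with the write test
    System.out.println("known keys: " + keys.size());
    System.out.println("existing pick is known: " + keys.contains(keys.pickExisting()));
    System.out.println("missing pick is known: " + keys.contains(keys.pickMissing()));
  }
}
```

With such a set, a read test can decide up front whether each request targets an existing or a nonexistent key, so a not-found result is an expected outcome rather than an error.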

There are failures in CI. Please make sure CI passes before merging, @kerneltime.

Oct 20 '22 02:10 kaijchen

Sorry, I had to revert this, because the checkstyle and findbugs failures affect all other PRs. Please fix the failures and open a new PR.

Oct 20 '22 02:10 adoroszlai

Sure!! I'll take a look at how Warp does it and create a jira for it!

Oct 20 '22 03:10 DaveTeng0

There are failures in CI. Please make sure CI passes before merging, @kerneltime.

Sorry!! I'll take a look!!

Oct 20 '22 03:10 DaveTeng0

Sorry, I had to revert this, because the checkstyle and findbugs failures affect all other PRs. Please fix the failures and open a new PR.

Sorry!! I'll take a look!! Thanks, Attila!

Oct 20 '22 03:10 DaveTeng0

Sorry, I had to revert this, because the checkstyle and findbugs failures affect all other PRs. Please fix the failures and open a new PR.

Thank you @adoroszlai! I should have checked.

Oct 20 '22 18:10 kerneltime
