
Standardize benchmark code of arctic

dimosped opened this issue on Apr 20 '18 • 0 comments

We currently have multiple unrelated benchmarks for various scenarios:

  • generic Arctic top level calls
  • draft Arctic breakdown solution for keeping track of where time goes ((de)compress, numpy, serialization, MongoDB IO)
  • draft Arrow serialization benchmarks

The goal is to create a standard API for benchmarks:

  • requirements

    • specify experiment scenarios in an easy way (e.g. a DSL or just a dict of fixed steps)
    • collection of results
    • plotting
    • break results down into components (e.g. compress, numpy object creation, serialization, mongo IO)
    • ensure benchmarking has no performance impact when benchmark mode is disabled (see the sketch after this list)
    • reproducible benchmarks
  • goals

    • understand our code's bottlenecks
    • have a standard way to perform and repeat benchmarks
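
A minimal sketch of what such an API could look like, combining a dict-based scenario spec with a component timer that becomes a no-op when benchmarking is disabled. Every name here (`BENCHMARK_ENABLED`, `timer`, `run_scenario`) is hypothetical, not existing arctic code:

```python
# Hypothetical sketch: none of these names exist in arctic today.
import time
from collections import defaultdict
from contextlib import contextmanager

BENCHMARK_ENABLED = True          # flip to False in production: timing becomes a no-op
_timings = defaultdict(list)      # component name -> list of wall-clock samples


@contextmanager
def timer(component):
    """Time one component: (de)compress, numpy, serialization, mongo IO."""
    if not BENCHMARK_ENABLED:     # disabled mode adds no measurable overhead
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        _timings[component].append(time.perf_counter() - start)


def run_scenario(scenario, repeats=5):
    """Run a dict-specified scenario of fixed (component, callable) steps."""
    for _ in range(repeats):      # repetition makes the numbers reproducible
        for component, step in scenario['steps']:
            with timer(component):
                step()
    # per-component totals, ready for plotting or comparison
    return {name: sum(samples) for name, samples in _timings.items()}
```

A scenario would then be a plain dict such as `{'steps': [('serialization', serialize_df), ('mongo_io', write_chunks)]}`, where the step callables are whatever functions the benchmark exercises; `run_scenario` repeats the steps and returns per-component totals ready for plotting.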

A skeleton of benchmarks already exists in the top-level directory, benchmarks. It contains some very basic examples and a readme (https://github.com/manahl/arctic/blob/master/benchmarks.md), but these should be expanded to cover all the storage engines and some more involved use cases and examples (e.g. chunkstore with numerics only vs. chunkstore with strings, version store with pickled objects, etc.).
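
Assuming the benchmark skeleton follows the airspeed velocity (asv) convention of `setup`/`time_*` methods, one of the more involved scenarios above might be sketched as follows. The library name, MongoDB host, and data shapes are illustrative assumptions, not taken from the issue:

```python
# Sketch only: 'bench.chunkstore', localhost Mongo, and the data shapes
# are illustrative assumptions.
import numpy as np
import pandas as pd
from arctic import Arctic, CHUNK_STORE


class ChunkstoreNumericVsStrings:
    """Round-trip timings for numeric-only vs string-heavy DataFrames."""

    def setup(self):
        self.store = Arctic('localhost')
        self.store.initialize_library('bench.chunkstore', lib_type=CHUNK_STORE)
        self.lib = self.store['bench.chunkstore']
        # ChunkStore's default DateChunker expects a DatetimeIndex named 'date'
        idx = pd.date_range('2016-01-01', periods=100000, freq='T', name='date')
        self.numeric = pd.DataFrame({'price': np.random.rand(len(idx))}, index=idx)
        self.strings = pd.DataFrame({'sym': ['AAPL'] * len(idx)}, index=idx)
        self.lib.write('strings', self.strings)   # pre-populate for the read benchmark

    def time_write_numeric(self):
        self.lib.write('numeric', self.numeric)

    def time_read_strings(self):
        self.lib.read('strings')

    def teardown(self):
        self.store.delete_library('bench.chunkstore')
```

asv times each `time_*` method separately, which gives the numeric-vs-string comparison directly; the same pattern extends to a version store benchmark with pickled objects.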
