
Extend Common Benchmark Fields

Open whitleykeith opened this issue 4 years ago • 4 comments

Common Benchmark Fields

With @learnitall's PR merged, we have a launch pad for creating a well-defined schema for our benchmarks. The biggest gap I've seen is the lack of common fields within benchmark results.

This proposal is to create strictly defined fields across all benchmarks, regardless of benchmark/environment/config. This data should consist of:

  • benchmark-agnostic, user-defined fields (e.g. uuid/run_id)
  • benchmark-agnostic, generated fields (e.g. start_time, end_time, duration, status)
  • fields that are benchmark-specific in use but benchmark-agnostic in definition (e.g. iteration/sample)

This would look something like this:

import datetime
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class BenchmarkResult:
    name: str
    uuid: str
    start_time: datetime.datetime
    end_time: datetime.datetime
    duration: str
    iteration: Optional[int]
    metadata: Dict[str, Any]
    config: Dict[str, Any]
    data: Dict[str, Any]
    labels: Dict[str, Any]
    tag: str

These fields should take precedence over any field defined with the same key in metadata/config/data/labels. For instance, if a benchmark creates a field called start_time in its data dictionary, it should be overwritten by the top-level start_time field.
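
As a minimal sketch of that precedence rule, the flattening step could look something like the following. The flatten_result helper is hypothetical and not part of the current code; the relative order of the four dictionaries is an arbitrary choice here, the only rule taken from the proposal is that the top-level fields win.

import dataclasses
from typing import Any, Dict

def flatten_result(result: "BenchmarkResult") -> Dict[str, Any]:
    # Hypothetical helper: merge metadata/config/data/labels into one flat
    # document, letting the strictly defined top-level fields win collisions.
    doc: Dict[str, Any] = {}
    # The four dictionaries go in first (order among them is an assumption)...
    for source in (result.labels, result.config, result.metadata, result.data):
        doc.update(source)
    # ...then the top-level fields overwrite any duplicates, e.g. a
    # "start_time" key placed in result.data by a benchmark is replaced here.
    for field in dataclasses.fields(result):
        if field.name not in ("metadata", "config", "data", "labels"):
            doc[field.name] = getattr(result, field.name)
    return doc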

Common logic for setting the fields

Subsequently, the Benchmark abstract class should enforce the definition of these fields while still giving users the ability to override them if required.

Pseudo-code Example:

import datetime
from abc import ABC, abstractmethod

class Benchmark(ABC):
    @abstractmethod
    def get_start_time(self) -> datetime.datetime:
        # common logic to create start_time here
        ...

    @abstractmethod
    def get_end_time(self) -> datetime.datetime:
        # common logic to create end_time here
        ...

    # This may require some changes in how the Benchmark class encapsulates a run
    def run(self):
        start_time = self.get_start_time()
        # run logic
        end_time = self.get_end_time()

Ideally we can create a common run method that ensures generated fields like start_time and end_time get added, but it will require tweaking how the Benchmark class currently works.
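
Something like the following could serve as that common run method. This is only a sketch: it assumes the wrapper exposes a benchmark-specific hook (called collect() here, which is hypothetical) and carries attributes such as name, uuid, config, labels, and tag.

    def run(self) -> BenchmarkResult:
        # Capture the generated fields around the benchmark-specific logic so
        # every wrapper reports consistent start_time/end_time/duration values.
        start_time = self.get_start_time()
        data = self.collect()  # hypothetical hook for benchmark-specific logic
        end_time = self.get_end_time()
        return BenchmarkResult(
            name=self.name,  # assumed attributes on the wrapper
            uuid=self.uuid,
            start_time=start_time,
            end_time=end_time,
            duration=str(end_time - start_time),
            iteration=None,
            metadata={},
            config=self.config,
            data=data,
            labels=self.labels,
            tag=self.tag,
        )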

whitleykeith avatar Jul 14 '21 16:07 whitleykeith

This would be fantastic! It would also allow us to create really nice, specific templates for Elasticsearch. One question I have, which @mfleader talked to me about, is about the definitions of config, metadata, and labels. Right now we have the following:

  • config represents the configuration of the benchmark itself
  • metadata are fields common to all benchmarks, such as cluster_name and user
  • labels are any key-value pairs that a user has passed in during snafu invocation.

I think that the metadata field is a bit confusing. We might want to look at the current fields we are placing there and decide whether or not they should be added to BenchmarkResult as fields.
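
To make the Elasticsearch angle concrete, a strict set of top-level fields would let us publish an index template along these lines. This is a rough sketch only; the index pattern and type choices are illustrative assumptions, not an existing snafu template.

# Illustrative index template body for the strictly defined fields, expressed
# as the dictionary that would be sent to Elasticsearch.
common_fields_template = {
    "index_patterns": ["snafu-*"],  # assumed naming pattern
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},
            "uuid": {"type": "keyword"},
            "start_time": {"type": "date"},
            "end_time": {"type": "date"},
            "duration": {"type": "keyword"},
            "iteration": {"type": "integer"},
            "tag": {"type": "keyword"},
        }
    },
}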

learnitall avatar Jul 14 '21 21:07 learnitall

Yeah, metadata is a bit weird. I'm fine if some of the fields I mentioned (start_time, end_time, run_id, etc.) live in that metadata class in the actual code, as long as they get flattened in the document. cluster_name shouldn't be there, though, since it's environment-specific.

whitleykeith avatar Jul 15 '21 14:07 whitleykeith

For newly proposed benchmark wrappers, let's say I have a field in my BenchmarkResult.data with a name, like duration, that conflicts with a field name in BenchmarkResult. If it has a different definition than BenchmarkResult.duration (specifically it is a sum of round trip times), should I just give that data field a new name?

mfleader avatar Jul 21 '21 20:07 mfleader

How is start_time defined? Does it come from the beginning of the benchmarking tool or the beginning of the parent Python process?

mfleader avatar Jul 21 '21 20:07 mfleader