pyperformance

Reorganize tags on benchmarks

Open • mdboom opened this issue 2 years ago • 2 comments

Once https://github.com/python/pyperformance/issues/208 is complete, we will probably want to reorganize the benchmarks into tags so that they are more useful and meaningful. This issue can hopefully provide a place for discussion.

Most importantly: Are the tags used anywhere else, such that changing them would cause problems? How can I find the people who rely on them, other than by posting here?

The currently assigned tags are:

apps: {'2to3', 'tornado_http', 'html5lib', 'chameleon'}
math: {'pidigits', 'float', 'nbody'}
regex: {'regex_v8', 'regex_compile', 'regex_effbot', 'regex_dna'}
serialize: {'xml_etree_generate', 'pickle_dict', 'json_dumps', 'unpickle_pure_python', 'unpickle', 'xml_etree_process', 'json_loads', 'pickle', 'xml_etree_parse', 'pickle_list', 'xml_etree_iterparse', 'unpickle_list', 'pickle_pure_python'}
startup: {'python_startup_no_site', 'python_startup'}
template: {'genshi_xml', 'mako', 'genshi_text', 'django_template'}
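As a quick way to inspect the list above, here is a minimal sketch that inverts it into a benchmark-to-tags view. The names `CURRENT_TAGS` and `tags_by_benchmark` are made up for this example, and the in-memory dict is only an assumption for illustration; it is not how pyperformance itself stores tags.

```python
# Tag -> benchmarks mapping, copied from the list in this issue.
CURRENT_TAGS = {
    "apps": {"2to3", "tornado_http", "html5lib", "chameleon"},
    "math": {"pidigits", "float", "nbody"},
    "regex": {"regex_v8", "regex_compile", "regex_effbot", "regex_dna"},
    "serialize": {
        "xml_etree_generate", "pickle_dict", "json_dumps",
        "unpickle_pure_python", "unpickle", "xml_etree_process",
        "json_loads", "pickle", "xml_etree_parse", "pickle_list",
        "xml_etree_iterparse", "unpickle_list", "pickle_pure_python",
    },
    "startup": {"python_startup_no_site", "python_startup"},
    "template": {"genshi_xml", "mako", "genshi_text", "django_template"},
}


def tags_by_benchmark(tag_map):
    """Invert a tag -> benchmarks mapping into benchmark -> tags."""
    inverted = {}
    for tag, benchmarks in tag_map.items():
        for name in benchmarks:
            inverted.setdefault(name, set()).add(tag)
    return inverted


BY_BENCHMARK = tags_by_benchmark(CURRENT_TAGS)

# Benchmarks that already carry more than one tag.
MULTI_TAGGED = sorted(n for n, tags in BY_BENCHMARK.items() if len(tags) > 1)

print(len(BY_BENCHMARK), "tagged benchmarks;", len(MULTI_TAGGED), "with more than one tag")
```

Going by the list as posted, every tagged benchmark currently has exactly one tag, so allowing several tags per benchmark would be a new pattern rather than an extension of an existing one.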

For the most part, I think the existing tags are fine, though "apps" is a little vague and should perhaps be removed.

I would propose adding the following tags (each benchmark can have multiple tags):

  • Size:
    • workload: This would be for benchmarks that represent real-world workloads. These would roll up into the "one big number" that we report in places like the CPython release notes. I'm not crazy about the name of this tag. Suggestions?
    • feature: The opposite of a macrobenchmark, for benchmarks that test a very specific feature.
  • Domain:
    • web: Typical tasks in server-side web development, for example serializing/deserializing HTML, JSON, and XML, and l10n- and i18n-related work
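
To make the "multiple tags per benchmark" idea concrete, here is a rough sketch of selecting benchmarks by a combination of tags. The assignments in `PROPOSED` are hypothetical and purely illustrative; nothing in this issue decides which benchmark gets which tag, and `select` is a made-up helper, not a pyperformance function.

```python
# Hypothetical tag assignments, for illustration only.
# The real assignments are exactly what this issue is meant to discuss.
PROPOSED = {
    "tornado_http": {"apps", "workload", "web"},
    "json_dumps": {"serialize", "feature", "web"},
    "nbody": {"math", "feature"},
}


def select(benchmarks, *required):
    """Return the sorted names of benchmarks that carry every tag in `required`."""
    wanted = set(required)
    # `wanted <= tags` is a subset test: all required tags must be present.
    return sorted(name for name, tags in benchmarks.items() if wanted <= tags)


# Benchmarks that would feed the "one big number".
print(select(PROPOSED, "workload"))

# Narrow, web-related benchmarks.
print(select(PROPOSED, "web", "feature"))
```

Treating tags as a set per benchmark keeps the size tags (workload/feature) and the domain tags (web, math, ...) independent, so either axis can be used alone or combined with the other.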

mdboom • May 24 '22 20:05

To me, the alternative to "workload" is "macro". My personal preference is "workload" though. 😄

ericsnowcurrently • May 25 '22 17:05

It would also make sense to have a tag for the source of the benchmark, e.g. "pyperformance" (or "default"), "pyston".

ericsnowcurrently • May 25 '22 17:05