
A performance test for the Ray Beam portable runner

Open pabloem opened this issue 3 years ago • 3 comments

It would be great to have a microbenchmark and a larger benchmark to measure our progress.

pabloem avatar Jun 14 '22 16:06 pabloem

I think I can spend some time working on this. This can help me get familiar with our current code and running environment.

wilsonwang371 avatar Jun 29 '22 05:06 wilsonwang371

That would be great! We can track performance using this action: https://github.com/benchmark-action/github-action-benchmark

I don't think it needs to be very big. I think if it processes 1GB running locally with the current implementation, we may be able to get something that we can track and improve over time.
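
As a rough sketch of what the tracked number could look like: the action linked above can ingest a JSON array of `{name, unit, value}` entries through its custom "smaller is better" tool. The helper names (`time_workload`, `to_benchmark_entry`), the metric name, and the stand-in workload are hypothetical; a real benchmark would run a pipeline on the Ray runner where the placeholder is.

```python
import json
import time


def time_workload(workload, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best


def to_benchmark_entry(name, seconds):
    """Build one result entry in the {name, unit, value} shape (smaller is better)."""
    return {"name": name, "unit": "seconds", "value": seconds}


def stand_in_workload():
    # Hypothetical placeholder: a real benchmark would run a Beam pipeline here.
    sum(len(str(i)) for i in range(100_000))


if __name__ == "__main__":
    elapsed = time_workload(stand_in_workload)
    entries = [to_benchmark_entry("local_pipeline_wall_time", elapsed)]
    print(json.dumps(entries, indent=2))
```

Taking the best of a few runs is one way to reduce noise on shared CI machines; a mean or median would work as well.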

pabloem avatar Jun 29 '22 15:06 pabloem

We have a few microbenchmarks in Beam that you could use as inspiration, but I don't think they're big enough to test our runner and optimize over time:

  • https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/fn_api_runner_microbenchmark.py (e.g. instead of creating 1000 elements, we could add a source that outputs more data, ~1 GB)
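
To make the "source that outputs more data" idea concrete, here is a minimal sketch of a deterministic generator that yields keyed records until a target payload size is reached. The function name `synthetic_records`, the 1 KiB record size, and the 100-key spread are assumptions for illustration, not anything taken from the Beam microbenchmark.

```python
import random
import string


def synthetic_records(target_bytes, record_size=1024, seed=0):
    """Yield (key, payload) pairs until `target_bytes` of payload has been produced."""
    rng = random.Random(seed)  # fixed seed so every run sees the same data
    produced = 0
    index = 0
    while produced < target_bytes:
        # The last record is shortened so the total matches the target exactly.
        size = min(record_size, target_bytes - produced)
        payload = "".join(rng.choices(string.ascii_lowercase, k=size))
        yield index % 100, payload  # 100 distinct keys (an arbitrary choice)
        produced += size
        index += 1
```

For small sizes the output could be passed to `beam.Create`; at around 1 GB a proper source that generates records on the workers would probably be needed, and the generation cost should be kept out of the measured time.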

pabloem avatar Jun 29 '22 21:06 pabloem