ray_beam_runner
A performance test for the Ray Beam portable runner
It would be great to have a microbenchmark and a larger benchmark to measure our progress.
I think I can spend some time working on this. This can help me get familiar with our current code and running environment.
That would be great! We can track performance using this action: https://github.com/benchmark-action/github-action-benchmark
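For reference, a rough sketch of how a benchmark script could write results for that action. It assumes the action's `customSmallerIsBetter` tool, which reads a JSON array of objects with `name`, `unit`, and `value` fields. The helper names (`time_call`, `to_benchmark_entries`) and the timed workload are hypothetical placeholders, not anything from this repo.

```python
import json
import time


def time_call(fn, *args):
    """Return the wall-clock seconds taken by one call to fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def to_benchmark_entries(timings):
    """Convert {name: seconds} into a list of name/unit/value entries."""
    return [
        {"name": name, "unit": "seconds", "value": round(seconds, 6)}
        for name, seconds in sorted(timings.items())
    ]


# Placeholder workload; a real run would time a pipeline on the runner instead.
timings = {"sum_1m_ints": time_call(sum, range(1_000_000))}
report = json.dumps(to_benchmark_entries(timings), indent=2)
print(report)
```

The printed JSON could be redirected to a file and passed to the action as its output file, with lower values treated as better.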
I don't think it needs to be very big. If it processes 1 GB running locally with the current implementation, that should give us a number we can track and improve over time.
We have a few microbenchmarks in Beam that you could use as inspiration, but I don't think they're big enough to test our runner and optimize over time:
- https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/fn_api_runner_microbenchmark.py (e.g. instead of creating 1000 elements, we could add a source that outputs more data, ~1 GB)
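
To make the "source that outputs more data" idea concrete, here is a minimal sketch of a generator that emits fixed-size records up to a target byte count, plus a timing wrapper. It uses only the standard library; `generate_records`, `run_benchmark`, and the 1 KiB record size are assumptions for illustration, and the example runs at 10 MiB rather than 1 GB so it finishes quickly.

```python
import time


def generate_records(total_bytes, record_size=1024):
    """Yield byte records whose lengths add up to exactly total_bytes."""
    full_records, remainder = divmod(total_bytes, record_size)
    payload = b"x" * record_size
    for _ in range(full_records):
        yield payload
    if remainder:
        # Final short record so the total matches the requested size.
        yield b"x" * remainder


def run_benchmark(total_bytes, record_size=1024):
    """Consume the generated records and return (bytes_processed, seconds)."""
    start = time.perf_counter()
    processed = sum(len(record) for record in generate_records(total_bytes, record_size))
    elapsed = time.perf_counter() - start
    return processed, elapsed


# 10 MiB here; the target discussed above would be roughly 1024 ** 3 bytes.
processed, elapsed = run_benchmark(10 * 1024 * 1024)
print(f"processed {processed} bytes in {elapsed:.3f} s")
```

In an actual pipeline, one option would be to call a generator like this from a `beam.FlatMap` over a small seed collection, so the data is produced inside the runner rather than materialized up front. That wiring is untested here.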