gloo
reference benchmarking in InfiniBand/RoCE
Hi! Firstly, thanks for the nice work. It's good to see the brief benchmark figures in README.md.
It would be great if anybody could show benchmark results for `--transport ibverbs` in the same or a similar configuration (4 machines with a 40GbE network). Thanks
Thanks for the kind words!
I don't have access to 40G IB, only 100G IB, so the result wouldn't be an apples-to-apples comparison.
@pietern First, I also appreciate your great contribution to Gloo. I'm just wondering if you have a plan and timeline for adding ibverbs support to Caffe2, which should significantly improve performance.
Thanks. Andy
Do you have plans to create benchmarks like the TF benchmarks (https://github.com/tensorflow/benchmarks) so that we can run an apples-to-apples comparison? I can run them in our lab with 25/40/100 Gb RDMA, RoCE, and TCP.
@bkovalev These benchmarks have nothing to do with any outside framework. They are native gloo algorithm benchmarks. Also see the `gloo/benchmark` directory for more info. You can build the benchmark tool by specifying `-DBUILD_BENCHMARK=ON` to CMake.