open_lm

Benchmark tok/sec with other libs

Open kernelmachine opened this issue 2 years ago • 4 comments

It would be great to benchmark OpenLM's tokens/sec against other libraries such as Mosaic, Metaseq, etc.

kernelmachine avatar Sep 26 '23 20:09 kernelmachine

Just a few data points from OpenLM with default hyperparameters: OpenLM-7B gets ~2.5K tokens/sec/GPU on 256 A100s and ~2700 tokens/sec/GPU on one node. OpenLM-1B gets ~9.5K tokens/sec/GPU on 128 A100s and ~11.5K tokens/sec/GPU on 32 A100s.
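For a rough sense of what these per-GPU figures mean at cluster scale, here is a minimal sketch that multiplies them out and compares per-GPU throughput between the larger and smaller runs. The helper names are hypothetical and not part of OpenLM; the inputs are only the approximate numbers quoted here.

```python
# Approximate per-GPU figures quoted in this comment:
# (model, number of A100s, tokens/sec/GPU).
RUNS = [
    ("OpenLM-7B", 256, 2500),
    ("OpenLM-1B", 128, 9500),
    ("OpenLM-1B", 32, 11500),
]


def aggregate_tokens_per_sec(num_gpus, tokens_per_sec_per_gpu):
    """Total cluster throughput implied by a per-GPU figure."""
    return num_gpus * tokens_per_sec_per_gpu


def per_gpu_ratio(larger_run, smaller_run):
    """Per-GPU throughput of the larger run relative to the smaller one."""
    return larger_run / smaller_run


for model, num_gpus, tps in RUNS:
    total = aggregate_tokens_per_sec(num_gpus, tps)
    print(f"{model} on {num_gpus} A100s: ~{total:,} tokens/sec in total")

# 1B: 128 A100s versus 32 A100s.
ratio_1b = per_gpu_ratio(9500, 11500)
# 7B: 256 A100s versus the single-node figure (~2700 tokens/sec/GPU).
ratio_7b = per_gpu_ratio(2500, 2700)
print(f"1B per-GPU throughput retained at 128 vs 32 GPUs: {ratio_1b:.1%}")
print(f"7B per-GPU throughput retained at 256 GPUs vs one node: {ratio_7b:.1%}")
```

Since the inputs are rounded ("~"), the ratios are only indicative of how much per-GPU throughput drops as the GPU count grows.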

kernelmachine avatar Sep 26 '23 22:09 kernelmachine

This could also be useful: https://github.com/mosaicml/llm-foundry/tree/main/scripts/train/benchmarking

ludwigschmidt avatar Oct 09 '23 10:10 ludwigschmidt

@achalddave is making fantastic progress here. When we're done, let's add the benchmarking results to the repository so others can compare numbers and check whether they locally get the expected performance.

ludwigschmidt avatar Oct 11 '23 09:10 ludwigschmidt

See #29 for improvements. We are now matching the MosaicML numbers on 1 node (~4000 tok/s/GPU for a 7B model with batch size 16 and sequence length 2048, on 8 A100s). Will close this once we test convergence on a large run.
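To check locally whether a run reaches a comparable figure, tokens/sec/GPU can be derived from the measured time per optimizer step. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "batch size 16" means 16 sequences per GPU, which this comment does not specify. If 16 is the global batch size, the implied step time is 8x shorter.

```python
def tokens_per_sec_per_gpu(global_batch_size, seq_len, step_time_s, num_gpus):
    """Per-GPU throughput from the tokens processed in one step."""
    if step_time_s <= 0 or num_gpus <= 0:
        raise ValueError("step_time_s and num_gpus must be positive")
    return global_batch_size * seq_len / (step_time_s * num_gpus)


def implied_step_time_s(global_batch_size, seq_len, tok_per_sec_per_gpu, num_gpus):
    """Step time that a given per-GPU throughput would imply."""
    if tok_per_sec_per_gpu <= 0 or num_gpus <= 0:
        raise ValueError("tok_per_sec_per_gpu and num_gpus must be positive")
    return global_batch_size * seq_len / (tok_per_sec_per_gpu * num_gpus)


NUM_GPUS = 8          # one node of 8 A100s, as stated above
SEQ_LEN = 2048        # as stated above
PER_GPU_BATCH = 16    # ASSUMPTION: batch size 16 is per GPU, not global
global_batch = PER_GPU_BATCH * NUM_GPUS

# Step time that ~4000 tok/s/GPU would correspond to under this assumption.
step_time = implied_step_time_s(global_batch, SEQ_LEN, 4000, NUM_GPUS)
print(f"implied step time: {step_time:.3f} s")

# Going the other way: plug in a measured step time to get tok/s/GPU.
measured = tokens_per_sec_per_gpu(global_batch, SEQ_LEN, step_time, NUM_GPUS)
print(f"throughput: {measured:.0f} tok/s/GPU")
```

In practice the step time should be averaged over many steps after warmup, since the first few steps typically include one-off setup costs.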

achalddave avatar Oct 13 '23 03:10 achalddave