open_lm Benchmark tok/sec with other libs

Open kernelmachine opened this issue 2 years ago • 4 comments

Would be great to benchmark tokens/sec of OpenLM, comparing to other libraries like Mosaic, Metaseq, etc.

Sep 26 '23 20:09 kernelmachine

Just a few datapoints from OpenLM with default hyperparameters: we get ~2.5K tokens/sec/GPU on 256 A100s for OpenLM-7B, ~9.5K tokens/sec/GPU on 128 A100s for OpenLM-1B, and ~11.5K tokens/sec/GPU on 32 A100s for OpenLM-1B. The 7B model gets ~2700 tokens/sec/GPU on one node.

Sep 26 '23 22:09 kernelmachine
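
For scale, the per-GPU figures in the comment above can be converted to aggregate throughput and to a rough scaling ratio. This is only an arithmetic sketch: the helper names are hypothetical and not part of open_lm, and the ratio simply divides the per-GPU rate at the larger GPU count by the rate at the smaller one.

```python
# Datapoints quoted in the comment above: (model, number of A100s, tokens/sec/GPU).
datapoints = [
    ("OpenLM-7B", 256, 2500),
    ("OpenLM-1B", 128, 9500),
    ("OpenLM-1B", 32, 11500),
]


def aggregate_tokens_per_sec(num_gpus, tokens_per_sec_per_gpu):
    """Total cluster throughput implied by a per-GPU rate (hypothetical helper)."""
    return num_gpus * tokens_per_sec_per_gpu


def scaling_ratio(per_gpu_rate_large, per_gpu_rate_small):
    """Per-GPU rate at the larger scale relative to the smaller scale."""
    return per_gpu_rate_large / per_gpu_rate_small


for model, gpus, rate in datapoints:
    total = aggregate_tokens_per_sec(gpus, rate)
    print(f"{model}: {gpus} GPUs x {rate} tok/s/GPU = {total} tok/s")

# 1B model: 128 GPUs vs 32 GPUs; 7B model: 256 GPUs vs one node.
print(f"1B scaling ratio: {scaling_ratio(9500, 11500):.2f}")
print(f"7B scaling ratio: {scaling_ratio(2500, 2700):.2f}")
```

A ratio below 1.0 means each GPU processes fewer tokens per second at the larger scale, which is the usual cost of extra communication.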

This could also be useful: https://github.com/mosaicml/llm-foundry/tree/main/scripts/train/benchmarking

Oct 09 '23 10:10 ludwigschmidt

@achalddave is making fantastic progress here. When we're done, let's add the benchmarking results to the repository so others can compare numbers and check whether they locally get the expected performance.

Oct 11 '23 09:10 ludwigschmidt

See #29 for improvements. Basically, we are now matching the MosaicML numbers on 1 node (~4000 tok/s/GPU for a 7B model with batch size 16, sequence length 2048, on 8 A100s). Will close this once we test convergence on a large run.

Oct 13 '23 03:10 achalddave
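
To check locally whether a run reaches a number like the one quoted above, tokens/sec/GPU can be derived from the measured time per training step. The sketch below is not open_lm code: the function names are hypothetical, and it assumes the batch size of 16 is per GPU, which the comment does not state.

```python
import time


def tokens_per_sec_per_gpu(per_gpu_batch_size, seq_len, step_time_s):
    """Tokens processed by one GPU per second, given the time for one step."""
    return per_gpu_batch_size * seq_len / step_time_s


def measure_step_time(step_fn, warmup=2, iters=5):
    """Average wall-clock seconds per call of step_fn, after warmup calls.

    step_fn is a placeholder for one training step. For GPU work it should
    block until the device has finished (for example by synchronizing
    inside step_fn), otherwise the timing only covers kernel launches.
    """
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    return (time.perf_counter() - start) / iters


# Assumption: batch size 16 is per GPU, so one step covers 16 * 2048 tokens
# on each GPU. A step time of 8.192 s then corresponds to 4000 tok/s/GPU.
rate = tokens_per_sec_per_gpu(per_gpu_batch_size=16, seq_len=2048, step_time_s=8.192)
print(f"{rate:.0f} tok/s/GPU")
```

If the batch size of 16 is instead global across the 8 GPUs, each GPU sees 2 sequences per step and the same rate corresponds to a step time of about 1.02 s.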
