transformers-bloom-inference
Incorrect benchmarking of t_generate_span
All 3 scripts under bloom-inference-scripts benchmark t_generate_span incorrectly. The reported t_generate_span is taken from the first generate() call, at https://github.com/huggingface/transformers-bloom-inference/blob/main/bloom-inference-scripts/bloom-ds-inference.py#L257, rather than being measured inside the benchmark cycle.
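To make the expected behaviour concrete, here is a minimal sketch of timing generate() inside the benchmark cycle rather than reusing the time of the first call. It is not the scripts' actual code: `benchmark_generate`, `fake_generate`, `cycles` and `warmup` are hypothetical names chosen for illustration, and the stand-in function just sleeps instead of running a model.

```python
import time


def benchmark_generate(generate, cycles=5, warmup=1):
    """Time generate() inside the benchmark loop.

    Returns (mean_seconds, per_cycle_seconds). Names and defaults are
    illustrative assumptions, not taken from the repository.
    """
    # Untimed warm-up calls, so one-time setup costs are excluded.
    for _ in range(warmup):
        generate()

    durations = []
    for _ in range(cycles):
        t_start = time.perf_counter()
        generate()
        durations.append(time.perf_counter() - t_start)

    return sum(durations) / len(durations), durations


def fake_generate():
    # Hypothetical stand-in for the scripts' generate(); no model involved.
    time.sleep(0.01)


mean_span, spans = benchmark_generate(fake_generate, cycles=3)
print(f"mean generate span: {mean_span:.4f} s over {len(spans)} cycles")
```

On a GPU the timed region would probably also need a `torch.cuda.synchronize()` before each clock read, since CUDA kernels launch asynchronously; that is left out here to keep the sketch free of dependencies.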