
TorchChat is slower than gpt-fast

Open malfet opened this issue 9 months ago • 3 comments

Using torch==2.4.0.dev20240502 on an Apple M2 Pro, I get the following numbers for stories110M with the float16 dtype:

| application | speed (eager) | speed (compile) |
|---|---|---|
| gpt-fast | 176 tokens/sec | 99 tokens/sec |
| torchchat | 76 tokens/sec | 33 tokens/sec |
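
To put the gap in relative terms, here is a small sketch that only does arithmetic on the four figures reported above. The dictionary layout and the helper names `slowdown` and `compile_vs_eager` are illustrative and not part of either project.

```python
# Throughput figures copied from the table above (tokens/sec).
results = {
    "gpt-fast": {"eager": 176, "compile": 99},
    "torchchat": {"eager": 76, "compile": 33},
}


def slowdown(mode: str) -> float:
    """How many times slower torchchat is than gpt-fast in the given mode."""
    return results["gpt-fast"][mode] / results["torchchat"][mode]


def compile_vs_eager(app: str) -> float:
    """Compiled throughput as a fraction of eager throughput for one app."""
    return results[app]["compile"] / results[app]["eager"]


for mode in ("eager", "compile"):
    print(f"{mode}: torchchat is {slowdown(mode):.2f}x slower than gpt-fast")

for app in results:
    print(f"{app}: compile runs at {compile_vs_eager(app):.0%} of eager throughput")
```

On these numbers torchchat is roughly 2.3x slower in eager mode and 3.0x slower with compile, and in both applications the compiled run is slower than the eager one.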

Commands to reproduce:

% python3 -mpip install --pre torch==2.4.0.dev20240502 --index-url https://download.pytorch.org/whl/nightly/cpu
% git clone https://github.com/pytorch-labs/gpt-fast -b malfet/set-prec-to-float16
% cd gpt-fast
% python3 generate.py --checkpoint_path ~/git/pytorch/torchchat/.model-artifacts/stories110M/stories110M.pt

And for torchchat:

% python3 torchchat.py generate stories110M --dtype float16 --device cpu

malfet, May 03 '24 16:05