torchchat
TorchChat is slower than gpt-fast
Using torch==2.4.0.dev20240502 on an Apple M2 Pro, I get the following numbers for stories110M with float16 dtype:
| application | speed (eager) | speed (compile) |
|---|---|---|
| gpt-fast | 176 tokens/sec | 99 tokens/sec |
| torchchat | 76 tokens/sec | 33 tokens/sec |
Commands to reproduce:
% python3 -mpip install --pre torch==2.4.0.dev20240502 --index-url https://download.pytorch.org/whl/nightly/cpu
% git clone https://github.com/pytorch-labs/gpt-fast -b malfet/set-prec-to-float16
% cd gpt-fast
% python3 generate.py --checkpoint_path ~/git/pytorch/torchchat/.model-artifacts/stories110M/stories110M.pt
and for torchchat:
% python3 torchchat.py generate stories110M --dtype float16 --device cpu
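For reference, the tokens/sec numbers above come from wall-clock timing of the generate loop. A minimal sketch of such a measurement is below; `dummy_generate` is a placeholder stand-in, not the API of either repo:

```python
import time

def measure_tokens_per_sec(generate_fn, num_tokens):
    """Time generate_fn producing num_tokens tokens and return throughput."""
    start = time.perf_counter()
    generate_fn(num_tokens)
    elapsed = time.perf_counter() - start
    return num_tokens / elapsed

# Placeholder generator that "emits" one token per loop iteration.
def dummy_generate(n):
    for _ in range(n):
        pass

rate = measure_tokens_per_sec(dummy_generate, 200)
print(f"{rate:.1f} tokens/sec")
```

Comparing the two projects with the same measurement window (and excluding model load and compile warm-up from the timed region) helps rule out harness differences as the source of the gap.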