torchchat
TorchChat is slower than gpt-fast
Using torch==2.4.0.dev20240502 on an Apple M2 Pro, I get the following numbers for stories110M with float16 dtype:
| application | speed (eager) | speed (compile) |
|---|---|---|
| gpt-fast | 176 tokens/sec | 99 tokens/sec |
| torchchat | 76 tokens/sec | 33 tokens/sec |
Commands to reproduce:
% python3 -mpip install --pre torch==2.4.0.dev20240502 --index-url https://download.pytorch.org/whl/nightly/cpu
% git clone https://github.com/pytorch-labs/gpt-fast -b malfet/set-prec-to-float16
% cd gpt-fast
% python3 generate.py --checkpoint_path ~/git/pytorch/torchchat/.model-artifacts/stories110M/stories110M.pt
and for torchchat:
% python3 torchchat.py generate stories110M --dtype float16 --device cpu
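For reference, the tokens/sec numbers above come from wall-clock timing of the generate loop. A minimal sketch of such a measurement is below; `dummy_generate` is a placeholder stand-in, not the API of either repo:

```python
import time

def measure_tokens_per_sec(generate_fn, num_tokens):
    """Time generate_fn producing num_tokens tokens and return throughput."""
    start = time.perf_counter()
    generate_fn(num_tokens)
    elapsed = time.perf_counter() - start
    return num_tokens / elapsed

# Placeholder generator that "emits" one token per loop iteration.
def dummy_generate(n):
    for _ in range(n):
        pass

rate = measure_tokens_per_sec(dummy_generate, 200)
print(f"{rate:.1f} tokens/sec")
```

Comparing the two projects with the same measurement window (and excluding model load and compile warm-up from the timed region) helps rule out harness differences as the source of the gap.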