
Only a slight performance improvement (ㄒoㄒ)

Open • 480284856 opened this issue 1 year ago • 1 comment

I only got a small improvement over the non-compiled code. Is there anything I missed?

Commands

cli 1: time python generate.py --compile --compile_prefill --checkpoint_path /root/gpt-fast/codellama-34b-python/model_int8.pth --prompt "def quicksort(arr):" --max_new_tokens 32 --num_samples 50

cli 2: time python generate.py --checkpoint_path /root/gpt-fast/codellama-34b-python/model_int8.pth --prompt "def quicksort(arr):" --max_new_tokens 32 --num_samples 50

Results

result of cli 1: 4.45 tokens/sec, 151.52 GB/s memory bandwidth

result of cli 2: 4.24 tokens/sec, 144.55 GB/s memory bandwidth

relative improvement (compile vs. no compile): speed: 4.9%, memory bandwidth: 4.8%
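For anyone checking the arithmetic, here is a minimal sketch of how the relative improvement can be recomputed from the numbers reported above. The helper name `relative_improvement` is made up for illustration and is not part of gpt-fast.

```python
def relative_improvement(new: float, base: float) -> float:
    """Percentage gain of `new` over `base` (hypothetical helper)."""
    return (new - base) / base * 100.0


# Figures copied from the results above (cli 1 = compiled, cli 2 = not compiled).
speed = relative_improvement(4.45, 4.24)           # tokens/sec
bandwidth = relative_improvement(151.52, 144.55)   # GB/s

print(f"speed: {speed:.2f}%")
print(f"memory bandwidth: {bandwidth:.2f}%")
```

Both values come out just under 5%, which matches the figures quoted in the issue up to rounding.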

Env

gpu: 1 x L40S

docker: python:3.9

pytorch installation: pip install torch

480284856 • Dec 14 '23 07:12

Are you using pytorch nightly? This perf seems much worse than I would expect

Chillee • Dec 15 '23 01:12
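One way to answer the nightly question is to look at the installed version string. The sketch below assumes that nightly wheels carry a `.devYYYYMMDD` segment in `torch.__version__`, while stable releases installed with `pip install torch` do not. The helper name `is_nightly` is hypothetical, and the version strings in the comments are only examples, not output from the environment in this issue.

```python
import re


def is_nightly(version: str) -> bool:
    """Return True if the version string looks like a nightly build.

    Assumption: nightlies look like '2.2.0.dev20231214+cu121',
    stable releases look like '2.1.2+cu121'.
    """
    return re.search(r"\.dev\d{8}", version) is not None


try:
    import torch
    kind = "nightly" if is_nightly(torch.__version__) else "not a nightly"
    print(f"torch {torch.__version__}: {kind}")
except ImportError:
    print("torch is not installed in this environment")
```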