WebGPT
add sample script for int8-gemm
I don't have time to integrate it into your systems in place, but this gets 3.5x the FLOPs on a very skinny matmul (cached KV inference) and should shrink the model checkpoint size by 4x. It still needs a better absmax calculation (probably vectorwise instead of the obviously suboptimal global one), but the MAE is very reasonable for the setup shown.
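For anyone following along, here is a rough sketch of the global vs. vectorwise absmax idea in plain Python. It is not the sample script from this PR: the function names, the toy shapes, and the outlier row are all assumptions made for illustration, and it only models the arithmetic (int8 values, integer accumulation, rescale at the end), not GPU performance.

```python
import random

QMAX = 127  # symmetric int8 range [-127, 127]


def absmax_scale(values):
    """Scale that maps the largest magnitude in `values` to QMAX."""
    absmax = max(abs(v) for v in values)
    return absmax / QMAX if absmax > 0 else 1.0


def quantize(values, scale):
    """Round to the nearest integer step and clamp to the int8 range."""
    return [max(-QMAX, min(QMAX, round(v / scale))) for v in values]


def quantize_global(weight):
    """One absmax scale shared by the whole matrix."""
    scale = absmax_scale([v for row in weight for v in row])
    return [quantize(row, scale) for row in weight], [scale] * len(weight)


def quantize_vectorwise(weight):
    """One absmax scale per row (per output channel)."""
    scales = [absmax_scale(row) for row in weight]
    return [quantize(row, s) for row, s in zip(weight, scales)], scales


def int8_matvec(q_weight, w_scales, x):
    """Skinny matmul: quantize x, accumulate in integers, rescale to float."""
    x_scale = absmax_scale(x)
    q_x = quantize(x, x_scale)
    return [
        sum(a * b for a, b in zip(q_row, q_x)) * w_scale * x_scale
        for q_row, w_scale in zip(q_weight, w_scales)
    ]


def mean_abs_error(a, b):
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


# Toy setup (hypothetical): row 0 has large outliers, which is the case
# where a single global scale wastes most of the int8 range on other rows.
random.seed(0)
rows, cols = 8, 64
weight = [
    [random.gauss(0.0, 50.0 if r == 0 else 1.0) for _ in range(cols)]
    for r in range(rows)
]
x = [random.gauss(0.0, 1.0) for _ in range(cols)]
reference = [sum(w * v for w, v in zip(row, x)) for row in weight]

mae_global = mean_abs_error(int8_matvec(*quantize_global(weight), x), reference)
mae_vectorwise = mean_abs_error(
    int8_matvec(*quantize_vectorwise(weight), x), reference
)
print(f"MAE global absmax:     {mae_global:.4f}")
print(f"MAE vectorwise absmax: {mae_vectorwise:.4f}")
```

The per-row scales cost one extra float per output channel, which is small next to the weights themselves. The 4x checkpoint figure would follow if the current weights are stored as 32-bit floats, which is an assumption on my part.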
The latest updates on your projects.
| Name | Status | Preview | Comments | Updated (UTC) |
|---|---|---|---|---|
| web-gpt | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | May 1, 2023 5:25pm |
Sweet! What's with the change to params_gpt?