Ammar Ahmad Awan
This is a follow-up PR to the existing one (https://github.com/microsoft/DeepSpeed/pull/2127). The goal is to keep this PR open and investigate the issue in more detail while the PR above removes...
UPDATE: We don't need to wait for the INT4 PR to get merged. Let's do this cleanup of unused quantization variables and flags independently. > @lekurile -- please work with...
When I modified “run_example.sh” and changed the backend to vllm, I got the error message below. I will check whether the error comes from the server side or...