AlpinDale
It's definitely a planned feature. I believe @sgsdxzy wanted to work on it.
I know what's happening. Will fix soon.
Please remove the `--quantization awq` part and try again.
As of v0.6.0, the `--load-in-{4bit,8bit,smooth}` args have been removed. Please use `-q fp8` instead.
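If your launch arguments are assembled by a script, something like the rough sketch below could handle the migration. It assumes the removed `--load-in-*` options were plain switches that took no value, and that `-q` is the only spelling of the quantization option your script uses. `migrate_args` is a hypothetical helper written for illustration, not part of aphrodite.

```python
# Switches named above as removed in v0.6.0.
REMOVED_FLAGS = {"--load-in-4bit", "--load-in-8bit", "--load-in-smooth"}


def migrate_args(argv):
    """Drop the removed --load-in-* switches; add `-q fp8` once if any were present.

    Hypothetical helper: assumes the removed switches took no value.
    """
    kept = [arg for arg in argv if arg not in REMOVED_FLAGS]
    removed_any = len(kept) != len(argv)
    # Do not add a second quantization option if the caller already passed -q.
    if removed_any and "-q" not in kept:
        kept += ["-q", "fp8"]
    return kept
```

Arguments that never used the removed switches pass through unchanged, so it should be safe to run on every launch.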
It's mostly due to the QuIP# kernels. I'll look into extending support to P100s tomorrow (we used to support them).
Please check #444. It builds for sm_60, but I haven't tested if it actually runs.
@online2311 we forgot to bump the build architectures in the Dockerfile; this will be fixed in the next release. If you want to build it yourself, edit the Dockerfile like...
Sorry I totally missed this! I'll take care of this soon.
This seems unrelated to aphrodite. Could be a host/port issue?
Try disabling CUDA graphs with `--enforce-eager`; that should help.