airllm
Does AirLLM Support Running Quantized Models (e.g., unsloth/Qwen2-72B-bnb-4bit)?
Does AirLLM currently support running 4-bit quantized models like unsloth/Qwen2-72B-bnb-4bit? I’m trying to load and run this model using AirLLM, but I’m encountering the following error during generation:
RuntimeError: Attempted to call variable.set_data(tensor), but variable and tensor have incompatible tensor type.
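A likely cause (this is my assumption, not confirmed by the AirLLM docs): bitsandbytes 4-bit checkpoints such as unsloth/Qwen2-72B-bnb-4bit store each weight as a packed torch.uint8 buffer plus quantization metadata (absmax, quant_state tensors), not as ordinary fp16/bf16 matrices. When AirLLM streams a shard layer by layer and assigns it into a regular Parameter, that dtype mismatch can surface as exactly this set_data error. A small sketch to inspect a state dict and spot such packed tensors (the tensor names below are hypothetical examples):

```python
import torch

def find_packed_tensors(state_dict):
    """Return names of tensors that look like bitsandbytes-packed weights.

    bnb 4-bit checkpoints keep weights as flat torch.uint8 buffers plus
    *.absmax / *.quant_state metadata, which a loader expecting plain
    fp16/bf16 shards cannot assign into ordinary Parameters.
    """
    suspicious = []
    for name, tensor in state_dict.items():
        if tensor.dtype == torch.uint8 or "quant_state" in name or "absmax" in name:
            suspicious.append(name)
    return suspicious

# Hypothetical demo state dict: one ordinary fp16 weight, two bnb-style entries
demo = {
    "model.layers.0.self_attn.q_proj.weight": torch.zeros(4, 4, dtype=torch.float16),
    "model.layers.0.self_attn.q_proj.weight.absmax": torch.zeros(2, dtype=torch.float32),
    "model.layers.0.mlp.gate_proj.weight": torch.zeros(8, dtype=torch.uint8),
}
print(find_packed_tensors(demo))
# → ['model.layers.0.self_attn.q_proj.weight.absmax', 'model.layers.0.mlp.gate_proj.weight']
```

If a checkpoint shows these packed entries, it was quantized ahead of time with bitsandbytes; an unquantized checkpoint of the same model (letting AirLLM apply its own compression, if enabled) would avoid the dtype clash.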
I also tried a smaller Qwen model, for example Qwen/Qwen2.5-0.5B, but ran into a different error:
AssertionError: model.safetensors.index.json should exist
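This assertion suggests AirLLM expects a sharded checkpoint with a model.safetensors.index.json, while small models like Qwen/Qwen2.5-0.5B ship a single model.safetensors with no index. One possible workaround (a sketch under my own assumptions, and no guarantee AirLLM's loader accepts the result) is to generate the index yourself by reading the safetensors header, which per the safetensors format is an 8-byte little-endian length followed by a JSON blob. The demo writes a tiny stand-in file rather than a real checkpoint:

```python
import json
import struct
from pathlib import Path

def build_index(safetensors_path: str) -> dict:
    """Build a model.safetensors.index.json-style dict for a single-file
    checkpoint by listing the tensor names in its safetensors header."""
    path = Path(safetensors_path)
    with path.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte LE header length
        header = json.loads(f.read(header_len))         # JSON header follows
    # Map every tensor to the one and only shard file
    weight_map = {name: path.name for name in header if name != "__metadata__"}
    return {"metadata": {"total_size": path.stat().st_size},
            "weight_map": weight_map}

# Hypothetical demo: write a minimal stand-in safetensors file, then index it
header = {"lm_head.weight": {"dtype": "F16", "shape": [2, 2],
                             "data_offsets": [0, 8]}}
blob = json.dumps(header).encode()
with open("model.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 8)

index = build_index("model.safetensors")
Path("model.safetensors.index.json").write_text(json.dumps(index, indent=2))
print(index["weight_map"])
# → {'lm_head.weight': 'model.safetensors'}
```

Even with the index in place, AirLLM may still expect per-layer shards internally, so this only clears the assertion, not necessarily the whole load path.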
I also tried running with Qwen-72B-instruct, and this is the error I got: