Atomheart-Father
I posted this problem on the hugging-quants discussion page, and they recommended that I open an issue here: https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4/discussions/13
> Try to use the device_map argument when creating the model. Recently, HF made some changes to loading that is causing this issue I only have 8 A800, which is...
> I would encourage you to look into how you can effectively use `accelerate` since AutoAWQ relies on this library to load `transformers` models. Specifically, you can design `device_map` for...
I wanted to inspect my current device_map, so I tried manually specifying the loading:

```python
model = AutoAWQForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    use_cache=False,
    device_map='cpu'
)
print(model.model.hf_device_map)
```

It returned `Loading checkpoint...
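Following the suggestion above to design a `device_map` explicitly, here is a minimal sketch of how one could be built by hand: a dict mapping module names to GPU indices, spreading the decoder layers evenly across the 8 cards. The layer count (126 for Llama 3.1 405B) and the module names (`model.embed_tokens`, `model.layers.N`, `model.norm`, `lm_head`) follow the usual Llama naming in `transformers`, but they are assumptions here and should be checked against the actual checkpoint (e.g. via `model.named_modules()`).

```python
# Sketch: manual device_map splitting decoder layers evenly across GPUs.
# Module names and layer count are assumptions based on the standard
# transformers Llama architecture; verify them for your checkpoint.

def build_device_map(num_layers: int, num_gpus: int) -> dict:
    device_map = {
        "model.embed_tokens": 0,           # embeddings on the first GPU
        "model.norm": num_gpus - 1,        # final norm with the head
        "lm_head": num_gpus - 1,           # LM head on the last GPU
    }
    per_gpu = -(-num_layers // num_gpus)   # ceiling division
    for layer in range(num_layers):
        device_map[f"model.layers.{layer}"] = layer // per_gpu
    return device_map

device_map = build_device_map(num_layers=126, num_gpus=8)
```

The resulting dict could then be passed as `device_map=device_map` to `from_pretrained` instead of `'cpu'` or `'auto'`. Note this simple split ignores the extra memory GPU 0 needs for embeddings and activations, so in practice the first and last GPUs may need fewer layers.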