AutoAWQ
Doesn't AutoAWQ support ZeRO-3?
Hi, I tried to load a quantized AWQ model with DeepSpeed ZeRO-3 and got the following error:
  File "/workspace/code/utils.py", line 61, in create_and_prepare_model
    model = AutoAWQForCausalLM.from_quantized(
  File "/usr/local/lib/python3.10/dist-packages/awq/models/auto.py", line 95, in from_quantized
    return AWQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized(
  File "/usr/local/lib/python3.10/dist-packages/awq/models/base.py", line 410, in from_quantized
    model = target_cls.from_config(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 437, in from_config
    return model_class._from_config(config, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 1318, in _from_config
    model = cls(config, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 503, in wrapper
    f(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 1136, in __init__
    self.model = LlamaModel(config)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 503, in wrapper
    f(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 940, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 513, in wrapper
    self._post_init_method(module)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 1051, in _post_init_method
    param.data = param.data.to(self.local_device)
NotImplementedError: Cannot copy out of meta tensor; no data!
The code is:
model = AutoAWQForCausalLM.from_quantized(
    args.model_name_or_path,
    max_seq_len=data_args.max_seq_length,
    fuse_layers=False,
    trust_remote_code=True,
    low_cpu_mem_usage=False,
)
and the DeepSpeed config is:
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_multinode_launcher: standard
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: true
  zero3_save_16bit_model: true
  zero_stage: 3
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
Is ZeRO-3 supported with AutoAWQ?
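For context on the traceback: `zero3_init_flag: true` makes Accelerate wrap model construction in DeepSpeed's `zero.Init` context (the `partition_parameters.py` frames above), whose post-init hook copies each freshly created parameter to the local device. AutoAWQ's `from_quantized` first builds the model skeleton on the meta device, where parameters have shape but no data, hence `NotImplementedError: Cannot copy out of meta tensor; no data!`. A sketch of a change that avoids the construction-time crash (it only skips `zero.Init`; it does not make AutoAWQ's quantized layers work with ZeRO-3 training):

```yaml
deepspeed_config:
  # unchanged keys omitted
  zero3_init_flag: false   # do not partition parameters during model construction
```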
DeepSpeed is not supported with AutoAWQ. We use accelerate.
Could you please share the accelerate config? What kind of parallelism are you using? Only DP?
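The maintainer's exact config is not shown in the thread, but for reference, a plain data-parallel `accelerate` setup without DeepSpeed generally looks like the following sketch (values are illustrative assumptions, not the maintainer's settings):

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU   # plain DDP instead of DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2              # one process per GPU
rdzv_backend: static
same_network: true
use_cpu: false
```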