
Doesn't AutoAWQ support ZeRO-3?


Hi, I tried to load a quantized AWQ model with DeepSpeed ZeRO-3 and hit the following error:

  File "/workspace/code/utils.py", line 61, in create_and_prepare_model
    model = AutoAWQForCausalLM.from_quantized(
  File "/usr/local/lib/python3.10/dist-packages/awq/models/auto.py", line 95, in from_quantized
    return AWQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized(
  File "/usr/local/lib/python3.10/dist-packages/awq/models/base.py", line 410, in from_quantized
    model = target_cls.from_config(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 437, in from_config
    return model_class._from_config(config, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 1318, in _from_config
    model = cls(config, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 503, in wrapper
    f(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 1136, in __init__
    self.model = LlamaModel(config)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 503, in wrapper
    f(module, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 940, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 513, in wrapper
    self._post_init_method(module)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 1051, in _post_init_method
    param.data = param.data.to(self.local_device)
NotImplementedError: Cannot copy out of meta tensor; no data!
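For context: AutoAWQ appears to build the model skeleton under accelerate's init_empty_weights(), which places parameters on the meta device (shapes and dtypes only, no storage), while zero3_init_flag: true makes accelerate wrap model construction in deepspeed.zero.Init(), whose post-init hook immediately copies each parameter to the local device. A minimal reproduction of the underlying PyTorch error, outside both libraries:

import torch

# Meta tensors carry shape/dtype but no data, so they cannot be copied.
t = torch.empty(4, 4, device="meta")
t.to("cpu")  # NotImplementedError: Cannot copy out of meta tensor; no data!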

The code is:

model = AutoAWQForCausalLM.from_quantized(
    args.model_name_or_path,
    max_seq_len=data_args.max_seq_length,
    fuse_layers=False,
    trust_remote_code=True,
    low_cpu_mem_usage=False,
)

and the accelerate config for DeepSpeed is:

compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_multinode_launcher: standard
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: true
  zero3_save_16bit_model: true
  zero_stage: 3
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
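The relevant line here is zero3_init_flag: true: it makes accelerate construct the model inside deepspeed.zero.Init(), which is the partition_parameters.py wrapper visible in the traceback. A rough sketch of what that context manager does at module construction time (standalone usage is only an illustration; it needs a distributed launcher such as deepspeed or torchrun):

import deepspeed
import torch.nn as nn

# Inside zero.Init(), every nn.Module's parameters are partitioned as soon
# as the module is built; the post-init hook copies param.data to the local
# device, which is exactly where a meta-device parameter has nothing to copy.
with deepspeed.zero.Init():
    emb = nn.Embedding(32000, 4096)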

Is ZeRO-3 supported with AutoAWQ?

ghost avatar Apr 08 '24 10:04 ghost

DeepSpeed is not supported with AutoAWQ. We use accelerate.

casper-hansen avatar Apr 08 '24 10:04 casper-hansen
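For reference, a minimal sketch of loading without DeepSpeed, letting accelerate place the weights via device_map; the model path is a placeholder and device_map="auto" is an illustrative choice, not official AutoAWQ guidance:

from awq import AutoAWQForCausalLM

# device_map (handled by accelerate) controls weight placement;
# "auto" spreads the layers across the available devices.
model = AutoAWQForCausalLM.from_quantized(
    "path/to/awq-quantized-model",  # placeholder path
    fuse_layers=False,
    device_map="auto",
)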

Could you please share your accelerate config? What kind of parallelism are you using? Only DP?

ghost avatar Apr 08 '24 10:04 ghost