generator_model = AutoModelForCausalLM.from_pretrained('deepseek-ai/DeepSeek-R1', trust_remote_code=True) throws an error in a RAG model

emclab opened this issue 10 months ago · 7 comments

Hello,

When I load the pre-trained DeepSeek-R1 model locally, as above, I get the following quantization type error:

Traceback (most recent call last):
  File "/Users/macbook/Documents/code/py/aimodels/transformer/rag_cn_ds.py", line 38, in <module>
    generator_model = AutoModelForCausalLM.from_pretrained('deepseek-ai/DeepSeek-R1', trust_remote_code=True) #, config=config1True, trust_remote_code=)
  File "/opt/anaconda3/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/anaconda3/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3605, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
  File "/opt/anaconda3/lib/python3.12/site-packages/transformers/quantizers/auto.py", line 181, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
  File "/opt/anaconda3/lib/python3.12/site-packages/transformers/quantizers/auto.py", line 105, in from_dict
    raise ValueError(
ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq']

I get a similar error when loading the V3 model. I am using Python 3.12 from anaconda3, with transformers 4.48.0.

from transformers import AutoConfig

config = AutoConfig.from_pretrained('deepseek-ai/DeepSeek-R1', trust_remote_code=True)
print("config.quantization_config : ", config.quantization_config)
print("config : ", config)

The output is:

config.quantization_config :  {'activation_scheme': 'dynamic', 'fmt': 'e4m3', 'quant_method': 'fp8', 'weight_block_size': [128, 128]}
config :  DeepseekV3Config {
  "_name_or_path": "deepseek-ai/DeepSeek-R1",
  "architectures": [
    "DeepseekV3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "deepseek-ai/DeepSeek-R1--configuration_deepseek.DeepseekV3Config",
    "AutoModel": "deepseek-ai/DeepSeek-R1--modeling_deepseek.DeepseekV3Model",
    "AutoModelForCausalLM": "deepseek-ai/DeepSeek-R1--modeling_deepseek.DeepseekV3ForCausalLM"
  },
  "aux_loss_alpha": 0.001,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "ep_size": 1,
  "first_k_dense_replace": 3,
  "hidden_act": "silu",
  "hidden_size": 7168,
  "initializer_range": 0.02,
  "intermediate_size": 18432,
  "kv_lora_rank": 512,
  "max_position_embeddings": 163840,
  "model_type": "deepseek_v3",
  "moe_intermediate_size": 2048,
  "moe_layer_freq": 1,
  "n_group": 8,
  "n_routed_experts": 256,
  "n_shared_experts": 1,
  "norm_topk_prob": true,
  "num_attention_heads": 128,
  "num_experts_per_tok": 8,
  "num_hidden_layers": 61,
  "num_key_value_heads": 128,
  "num_nextn_predict_layers": 1,
  "pretraining_tp": 1,
  "q_lora_rank": 1536,
  "qk_nope_head_dim": 128,
  "qk_rope_head_dim": 64,
  "quantization_config": {
    "activation_scheme": "dynamic",
    "fmt": "e4m3",
    "quant_method": "fp8",
    "weight_block_size": [
      128,
      128
    ]
  },
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "beta_fast": 32,
    "beta_slow": 1,
    "factor": 40,
    "mscale": 1.0,
    "mscale_all_dim": 1.0,
    "original_max_position_embeddings": 4096,
    "type": "yarn"
  },
  "rope_theta": 10000,
  "routed_scaling_factor": 2.5,
  "scoring_func": "sigmoid",
  "seq_aux": true,
  "tie_word_embeddings": false,
  "topk_group": 4,
  "topk_method": "noaux_tc",
  "torch_dtype": "bfloat16",
  "transformers_version": "4.48.0",
  "use_cache": true,
  "v_head_dim": 128,
  "vocab_size": 129280
}

How can I eliminate these load errors? If there is any relevant documentation, please recommend it. Thanks.

emclab avatar Jan 24 '25 22:01 emclab

Same error : ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq']

Any help will be much appreciated.

aswinaus avatar Jan 26 '25 02:01 aswinaus

Same error

qlj215 avatar Jan 30 '25 15:01 qlj215

Based on another thread, I was able to get it working without quantization. You can check it here: https://github.com/aswinaus/RAG/blob/main/RAG_DeepSeekR1.ipynb (a minimal sketch of the idea follows below).
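
For reference, here is a minimal sketch of that idea (hypothetical code, not copied verbatim from the notebook): load the config, delete the unsupported quantization_config, and pass the stripped config to from_pretrained so the plain bf16 weights are loaded. device_map="auto" assumes accelerate is installed.

from transformers import AutoConfig, AutoModelForCausalLM

model_name = "deepseek-ai/DeepSeek-R1"

# Strip the fp8 quantization_config that transformers 4.48.0 does not
# recognize, then load the unquantized (bf16) weights instead.
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
if hasattr(config, "quantization_config"):
    delattr(config, "quantization_config")

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    trust_remote_code=True,
    device_map="auto",  # requires accelerate
)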

aswinaus avatar Jan 30 '25 16:01 aswinaus

Hi @aswinaus, thanks for sharing. I read through the link in your comment. Is the loading def below the trick? Any more detail would be appreciated.

Also, the model loaded there is a 7B variant, not the 671B version. Is it possible to load the full version of R1 with 671B params?

from typing import Tuple

from transformers import (
    AutoConfig,
    AutoModel,
    AutoTokenizer,
    PreTrainedModel,
    PreTrainedTokenizerBase,
)


def load_model_with_quantization_fallback(
    model_name: str = "deepseek-ai/deepseek-llm-7b-chat",
    trust_remote_code: bool = True,
    device_map: str = "auto",  # requires accelerate
    **kwargs,
) -> Tuple[PreTrainedModel, PreTrainedTokenizerBase]:
    # First try loading with the checkpoint's original configuration.
    try:
        model = AutoModel.from_pretrained(
            model_name,
            trust_remote_code=trust_remote_code,
            device_map=device_map,
            **kwargs,
        )
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        print("Model loaded successfully with original configuration")
        return model, tokenizer
    except ValueError as e:
        if "Unknown quantization type" in str(e):
            print(
                "Quantization type not supported directly. "
                "Attempting to load without quantization..."
            )

            # Strip the quantization_config so from_pretrained loads the
            # plain (unquantized) weights instead.
            config = AutoConfig.from_pretrained(
                model_name,
                trust_remote_code=trust_remote_code,
            )
            if hasattr(config, "quantization_config"):
                delattr(config, "quantization_config")

            try:
                model = AutoModel.from_pretrained(
                    model_name,
                    config=config,
                    trust_remote_code=trust_remote_code,
                    device_map=device_map,
                    **kwargs,
                )
                tokenizer = AutoTokenizer.from_pretrained(
                    model_name,
                    trust_remote_code=trust_remote_code,
                )
                print("Model loaded successfully without quantization")
                return model, tokenizer
            except Exception as inner_e:
                print(f"Failed to load model without quantization: {inner_e}")
                raise
        else:
            print(f"Unexpected error during model loading: {e}")
            raise

emclab avatar Feb 02 '25 19:02 emclab

Yes, it worked for the 7B for me, and the 671B should work as long as you have the necessary memory and processing power. Remember to update the model name passed to the def above to the 671B model name (see the example call below).
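
For example, a hypothetical call using the helper above with the full checkpoint; deepseek-ai/DeepSeek-R1 is the 671B repo, and at bf16 the 671B weights alone are roughly 1.3 TB, so a correspondingly large multi-GPU or offload setup is assumed:

# Hypothetical usage of the fallback helper with the full 671B model.
# 671B params at 2 bytes each (bf16) is ~1.3 TB of weights, so this
# only makes sense on hardware with that much aggregate memory.
model, tokenizer = load_model_with_quantization_fallback(
    model_name="deepseek-ai/DeepSeek-R1",
    device_map="auto",
)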

aswinaus avatar Feb 02 '25 22:02 aswinaus

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you believe this issue is still relevant, please leave a comment to keep it open. Thank you for your contributions!

github-actions[bot] avatar Mar 05 '25 00:03 github-actions[bot]

If @emclab was able to resolve the issue, then it can be closed.

aswinaus avatar Mar 05 '25 00:03 aswinaus

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you believe this issue is still relevant, please leave a comment to keep it open. Thank you for your contributions!

github-actions[bot] avatar Apr 12 '25 00:04 github-actions[bot]
