Baichuan-13B
Loading the model fails with a get_input_embeddings NotImplementedError
/root/anaconda3/envs/lmflow_v3/lib/python3.9/site-packages/lmflow-0.0.1-py3.9.egg/lmflow/models/hf_decoder_model.py:228 in __init__

    225 │           # We resize the embeddings only when necessary to avoid index errors.
    226 │           # If you are creating a model from scratch on a small vocab and want a
    227 │           # smaller embedding size, remove this test.
  ❱ 228 │           embedding_size = model.get_input_embeddings().weight.shape[0]
    229 │           if len(tokenizer) > embedding_size:
    230 │               model.resize_token_embeddings(len(tokenizer))
    231 │

/root/anaconda3/envs/lmflow_v3/lib/python3.9/site-packages/transformers/modeling_utils.py:1192 in get_input_embeddings

   1189 │       base_model = getattr(self, self.base_model_prefix, self)
   1190 │       print("debug", base_model, self.base_model_prefix, self)
   1191 │       if base_model is not self:
 ❱ 1192 │           return base_model.get_input_embeddings()
   1193 │       else:
   1194 │           raise NotImplementedError
   1195 │

/root/anaconda3/envs/lmflow_v3/lib/python3.9/site-packages/transformers/modeling_utils.py:1194 in get_input_embeddings

   1191 │       if base_model is not self:
   1192 │           return base_model.get_input_embeddings()
   1193 │       else:
 ❱ 1194 │           raise NotImplementedError
   1195 │
   1196 │   def set_input_embeddings(self, value: nn.Module):
   1197 │       """

NotImplementedError
debug BaichuanModel(
  (embed_tokens): Embedding(64000, 5120, padding_idx=0)
  (layers): ModuleList(
    (0-39): 40 x BaichuanLayer(
      (self_attn): BaichuanAttention(
        (W_pack): MergedLinear(
          in_features=5120, out_features=15360, bias=False
          (lora_dropout): Dropout(p=0.1, inplace=False)
          (lora_A): Linear(in_features=5120, out_features=16, bias=False)
          (lora_B): Conv1d(16, 10240, kernel_size=(1,), stride=(1,), groups=2, bias=False)
        )
        (o_proj): Linear(in_features=5120, out_features=5120, bias=False)
      )
      (mlp): MLP(
        (gate_proj): Linear(in_features=5120, out_features=13696, bias=False)
        (down_proj): Linear(in_features=13696, out_features=5120, bias=False)
        (up_proj): Linear(in_features=5120, out_features=13696, bias=False)
        (act_fn): SiLUActivation()
      )
      (input_layernorm): RMSNorm()
      (post_attention_layernorm): RMSNorm()
    )
  )
  (norm): RMSNorm()
) model BaichuanForCausalLM(
  (model): BaichuanModel(
    (embed_tokens): Embedding(64000, 5120, padding_idx=0)
    (layers): ModuleList(
      (0-39): 40 x BaichuanLayer(
        (self_attn): BaichuanAttention(
          (W_pack): MergedLinear(
            in_features=5120, out_features=15360, bias=False
            (lora_dropout): Dropout(p=0.1, inplace=False)
            (lora_A): Linear(in_features=5120, out_features=16, bias=False)
            (lora_B): Conv1d(16, 10240, kernel_size=(1,), stride=(1,), groups=2, bias=False)
          )
          (o_proj): Linear(in_features=5120, out_features=5120, bias=False)
        )
        (mlp): MLP(
          (gate_proj): Linear(in_features=5120, out_features=13696, bias=False)
          (down_proj): Linear(in_features=13696, out_features=5120, bias=False)
          (up_proj): Linear(in_features=5120, out_features=13696, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): RMSNorm()
        (post_attention_layernorm): RMSNorm()
      )
    )
    (norm): RMSNorm()
  )
  (lm_head): Linear(in_features=5120, out_features=64000, bias=False)
)
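From the traceback and the debug print, base_model_prefix is "model", so BaichuanForCausalLM.get_input_embeddings() forwards to the wrapped BaichuanModel; the remote Baichuan modeling code loaded here does not override get_input_embeddings, so the generic fallback in modeling_utils.py ends up raising NotImplementedError. A minimal stopgap sketch, assuming the structure shown in the dump above and the model/tokenizer variables from hf_decoder_model.py, is to read the vocabulary size straight from embed_tokens:

```python
# Hedged workaround sketch, not LMFlow's actual code: fall back to the
# embedding module visible in the printed structure when the custom
# Baichuan modeling code does not implement get_input_embeddings.
# The attribute path model.model.embed_tokens is taken from the dump
# (BaichuanForCausalLM -> BaichuanModel -> embed_tokens) and is an
# assumption for other checkpoints.
def safe_embedding_size(model):
    try:
        return model.get_input_embeddings().weight.shape[0]
    except NotImplementedError:
        return model.model.embed_tokens.weight.shape[0]

embedding_size = safe_embedding_size(model)
if len(tokenizer) > embedding_size:
    # Note: resize_token_embeddings also relies on get_input_embeddings,
    # so if this branch is actually needed the real fix is updating the
    # remote modeling code (see the reply below).
    model.resize_token_embeddings(len(tokenizer))
```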
Try pulling the latest code from Hugging Face.
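A minimal sketch of that suggestion, assuming the checkpoint is baichuan-inc/Baichuan-13B-Base (substitute the checkpoint actually used); the point is to refresh the cached remote modeling code rather than reuse an older modeling_baichuan.py:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "baichuan-inc/Baichuan-13B-Base"  # assumed checkpoint; adjust as needed

# trust_remote_code=True loads the repo's custom modeling code;
# force_download=True re-fetches the repo files so a stale cached copy is
# not reused (clearing ~/.cache/huggingface also works).
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True, force_download=True)
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True, force_download=True)

# If the refreshed modeling code defines get_input_embeddings, the check
# in hf_decoder_model.py (line 228 above) no longer raises:
print(model.get_input_embeddings())
```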