
Error when converting llama1 ckpts to HF format

Open mabingqi opened this issue 1 year ago • 5 comments

System Info

  • transformers version: 4.41.0.dev0
  • Platform: Linux-4.18.0-425.3.1.el8.x86_64-x86_64-with-glibc2.17
  • Python version: 3.9.12
  • Huggingface_hub version: 0.21.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.21.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [x] My own task or dataset (give details below)

Reproduction

python transformers/models/llama/convert_llama_weights_to_hf.py --input_dir path-to-llama1-7b-source --model_size 7B --output_dir path-to-llama1-7b-target --llama_version 1

Expected behavior

Instead of a converted checkpoint, the script fails with:

RuntimeError: shape '[32, 2, 2, 4096]' is invalid for input of size 16777216

The error is raised at https://github.com/huggingface/transformers/blob/f26e4073707189c93915227779a4f6ea3c40d43b/src/transformers/models/llama/convert_llama_weights_to_hf.py#L181. The k_proj weight in llama1-7B is 4096 × 4096, but `dim1` here is 128. This may be a bug in the llama1 conversion path.
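For context, the failing reshape can be reproduced in isolation. The sketch below assumes a `permute` helper with the signature suggested by the error shape `[32, 2, 2, 4096]` (names and exact signature are an assumption, not copied from the script): with `dim1=128` (a key/value dimension computed for grouped-query-attention models) the view does not match the 4096 × 4096 `k_proj` of llama1-7B, while `dim1=4096` (no GQA, so `dim1` equals the hidden size) works.

```python
import torch

# Hypothetical sketch of the rotary-permutation helper used by the
# conversion script; names and signature are assumed from the traceback.
def permute(w, n_heads, dim1, dim2):
    return w.view(n_heads, dim1 // n_heads // 2, 2, dim2).transpose(1, 2).reshape(dim1, dim2)

wk = torch.empty(4096, 4096)  # llama1-7B k_proj is 4096 x 4096 = 16777216 elements

# dim1=128 asks for a view of shape (32, 2, 2, 4096) = 524288 elements,
# which cannot hold 16777216 elements -> the reported RuntimeError:
try:
    permute(wk, n_heads=32, dim1=128, dim2=4096)
except RuntimeError as e:
    print(e)  # shape '[32, 2, 2, 4096]' is invalid for input of size 16777216

# With dim1 equal to the full hidden size the view is (32, 64, 2, 4096),
# which matches 16777216 elements and round-trips to 4096 x 4096:
out = permute(wk, n_heads=32, dim1=4096, dim2=4096)
print(out.shape)  # torch.Size([4096, 4096])
```

This matches the report: llama1 (and llama2-7B) have no GQA, so `k_proj` is square and `dim1` must be 4096, not the 128 apparently passed after the llama3 changes.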

mabingqi avatar May 09 '24 04:05 mabingqi

cc @ArthurZucker

amyeroberts avatar May 09 '24 08:05 amyeroberts

Same question!

ZHEGG avatar May 09 '24 09:05 ZHEGG

Haha that's annoying, we might have broken conversion for llama1 when adding llama3. Could you test on transformers==4.38 or 4.39?

ArthurZucker avatar May 09 '24 09:05 ArthurZucker

I encountered the same error when I was converting the Llama 2 model. Using transformers==4.38 solved this problem.

RPC2 avatar May 10 '24 07:05 RPC2

Yep, it's not expected. I'll open a PR to fix conversion on all models 🤗

ArthurZucker avatar May 10 '24 08:05 ArthurZucker

Any update on this issue, please?

yoryis avatar May 16 '24 18:05 yoryis

If you just need to convert the old Llama models, you can quickly revert to a previous version of transformers (prior to 4.39.0) 😉

ArthurZucker avatar May 20 '24 08:05 ArthurZucker

The problem is not solved by changing the transformers version. I'm trying to convert llama-2 and it gives me the same error.

pep1t0 avatar Jun 08 '24 10:06 pep1t0

Sorry about that, I'll do this today!

ArthurZucker avatar Jun 11 '24 13:06 ArthurZucker

Hey @ArthurZucker. Any updates? I am running into the same issue when trying to convert llama-2 (and downgrading transformers to 4.38 also doesn't fix it).

AnaMVasilcoiu avatar Jun 13 '24 13:06 AnaMVasilcoiu

Hey! I am downloading everything now to fix it! 🤗 There is #30734 as well.

ArthurZucker avatar Jun 19 '24 07:06 ArthurZucker

Sorry for the delay!

ArthurZucker avatar Jun 25 '24 15:06 ArthurZucker

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jul 21 '24 08:07 github-actions[bot]