Error when using OpenELM as the LLM backbone
Hey, TinyLLaVA Factory is great work, and I sincerely thank you for sharing it. However, I run into a problem when I use OpenELM as the backbone (TinyLlama as the backbone works fine in my environment).
The problem is that the run stops with the prompt "do you wish to run the custom code y/n" and asks for "trust_remote_code=true".
However, I see it has already been set here: configuration_tiny_llava.py, line 101.
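For reference, here is a minimal sketch of where that prompt normally comes from, assuming a plain transformers from_pretrained call rather than the actual TinyLLaVA Factory code path (the local path is the illustrative one from my setup):

```python
from transformers import AutoConfig, AutoModelForCausalLM

llm_path = "/model/OpenELM-450M-Instruct"  # illustrative local path from my setup

# OpenELM ships custom modeling code, so loading it without
# trust_remote_code=True makes transformers ask "Do you wish to run
# the custom code?" interactively.
config = AutoConfig.from_pretrained(llm_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(llm_path, trust_remote_code=True)
```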
I have no idea why it happens, so I opened this issue. Thanks in advance!
Thank you for your question. Could you show me which script you were running when this problem occurred?
Thanks for your quick reply! The script is "train_tinyllama.sh". Here are the details:
```bash
DATA_PATH=/tiny_llava_datasets/text_files/blip_laion_cc_sbu_558k.json   #pretrain annotation file path
FINETUNE_DATA_PATH=/tiny_llava_datasets/text_files/llava_v1_5_mix665k.json   #finetune annotation file path
IMAGE_PATH=/tiny_llava_datasets/llava/llava_pretrain/images   #pretrain image dir
FINETUNE_IMAGE_PATH=/data/tiny_llava_datasets   #finetune image dir

LLM_VERSION=/model/OpenELM-450M-Instruct        # llm path in huggingface
VT_VERSION=/model/clip-vit-large-patch14-336    #vision tower path in huggingface
VT_VERSION2=""            #if you are not using mof vision tower, keep it empty
CN_VERSION=mlp2x_gelu     #connector type, other options are: qformer, resampler, etc
CONV_VERSION=llama        #chat template, other options are: phi, llama, gemma, etc
VERSION=base              #experiment name for recording different runs
TRAIN_RECIPE=common       #training recipes, other options are: lora, qlora
MODEL_MAX_LENGTH=2048     #max model length for llm

bash scripts/train/pretrain.sh "$DATA_PATH" "$IMAGE_PATH" "$LLM_VERSION" "$VT_VERSION" "$VT_VERSION2" "$CN_VERSION" "$VERSION" "$TRAIN_RECIPE" "$MODEL_MAX_LENGTH"
bash scripts/train/finetune.sh "$FINETUNE_DATA_PATH" "$FINETUNE_IMAGE_PATH" "$LLM_VERSION" "$VT_VERSION" "$VT_VERSION2" "$CN_VERSION" "$CONV_VERSION" "$VERSION" "$TRAIN_RECIPE" "$MODEL_MAX_LENGTH"
```
and the error raised is:
My OpenELM-450M folder contains (I wonder whether the Python files matter?):
And my transformers version is: 4.39.3
I did not change anything else apart from setting "--attn_implementation eager" in pretrain.sh, because my device is a V100, which does not support FlashAttention.
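For reference, a rough sketch of what that flag corresponds to on the transformers side (assuming a plain from_pretrained call, not the exact TinyLLaVA Factory code path; the model path is the illustrative one from my setup):

```python
import torch
from transformers import AutoModelForCausalLM

# "eager" selects the plain PyTorch attention implementation instead of
# FlashAttention-2, which is not supported on V100 GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "/model/OpenELM-450M-Instruct",   # illustrative local path
    torch_dtype=torch.float16,
    attn_implementation="eager",
    trust_remote_code=True,
)
```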
Sincere thanks!
Hi, we have provided a specific script for training OpenELM. Please try that one: scripts/train/openelm/train_openelm.sh.
P.S. The pretraining and finetuning scripts for OpenELM are a bit different from the TinyLlama ones, so please try scripts/train/openelm/train_openelm.sh.
Hello, the tokenizer used by OpenELM is the Llama tokenizer, so you need to specify the tokenizer type in the script. The TinyLlama script does not specify this, which leads to the error you mentioned. You can refer to scripts/train/openelm/pretrain_openelm.sh or scripts/train/openelm/finetune_openelm.sh for the specific settings. We recommend using scripts/train/openelm/train_openelm.sh directly, which already has the basic experimental settings filled in for you.
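For illustration, a minimal sketch of the mismatch described above, assuming the Llama 2 tokenizer checkpoint is the one to pair with OpenELM (the exact checkpoint name here is an assumption; the authoritative argument names and values are in the OpenELM scripts mentioned above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# OpenELM ships only the model weights and custom modeling code; it reuses
# the Llama tokenizer, so the model and the tokenizer come from different repos.
llm_path = "/model/OpenELM-450M-Instruct"       # illustrative local path
tokenizer_path = "meta-llama/Llama-2-7b-hf"     # assumption: Llama 2 tokenizer

model = AutoModelForCausalLM.from_pretrained(llm_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)

# If the training script is only given llm_path, it also tries to load the
# tokenizer from there, which is what triggers the error described above.
```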
Sorry for the late reply. I tried train_openelm.sh and it works well. Thanks for your help! I will close this issue as solved.