Efficient-Large-Language-Model
Could you try VILA1.5 and let us know if there is still an issue?
Why is `from tf_utils import flatten, shape_list` commented out?
Will verify and fix. BTW, you need to use `--conv-mode=llama_3` with the Llama 3 model.
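For reference, a minimal inference sketch with the conv mode matched to the model family; the `llava/eval/run_vila.py` entry point and the `Efficient-Large-Model/Llama-3-VILA1.5-8b` checkpoint name are assumptions based on the repo layout:

```bash
# Sketch of an inference call for a Llama 3 based VILA checkpoint;
# script path and checkpoint name are assumptions, not verified here.
python -W ignore llava/eval/run_vila.py \
    --model-path Efficient-Large-Model/Llama-3-VILA1.5-8b \
    --conv-mode llama_3 \
    --query "<image>\nDescribe the image." \
    --image-file demo.jpg
```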
It seems that when using the correct conv mode, there is no issue, so no code change is needed.
For VILA1.5-40B, you should use `--conv-mode hermes-2`.
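The same sketch as above applies to the 40B model, swapping in its conv mode; the `Efficient-Large-Model/VILA1.5-40b` checkpoint name is likewise an assumption:

```bash
# Same call as the sketch above, with the 40B checkpoint and its conv mode.
python -W ignore llava/eval/run_vila.py \
    --model-path Efficient-Large-Model/VILA1.5-40b \
    --conv-mode hermes-2 \
    --query "<image>\nDescribe the image." \
    --image-file demo.jpg
```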
Yes, the models are trained with those conv modes; in theory, we should bake this parameter into the model config and not let the user change it.
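A minimal sketch of what that could look like, purely illustrative: the `conv_mode` config key and the `resolve_conv_mode` helper are hypothetical, not the repo's current API.

```python
# Hypothetical sketch: read a conversation mode baked into the checkpoint's
# config.json, falling back to the user-supplied CLI flag. The "conv_mode"
# key and this helper are illustrative; the real config has no such field yet.
import json
from pathlib import Path
from typing import Optional

def resolve_conv_mode(model_path: str, cli_conv_mode: Optional[str] = None) -> str:
    config_file = Path(model_path) / "config.json"
    if config_file.exists():
        config = json.loads(config_file.read_text())
        baked = config.get("conv_mode")
        if baked is not None:
            if cli_conv_mode and cli_conv_mode != baked:
                print(f"warning: ignoring --conv-mode={cli_conv_mode}; "
                      f"checkpoint pins conv_mode={baked}")
            return baked
    # No baked-in value: trust the flag, or fall back to a generic default.
    return cli_conv_mode or "vicuna_v1"
```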
LoRA training is not well supported. I would recommend doing regular (full) finetuning instead.
Sorry, we merged a PR yesterday and it turned out to be problematic. We just rolled it back. Could you pull and try again?
Are you using Llama 3? If so, you need to pass `--conv-mode=llama_3`.
Yes, will do.