Efficient-Large-Language-Model

16 comments by Efficient-Large-Language-Model

Could you try VILA1.5 and let us know if the issue persists?

Why did you disable `from tf_utils import flatten, shape_list`?

Will verify and fix. By the way, you need to use `--conv-mode=llama_3` with the Llama 3 model.

It seems there is no issue when the correct conv mode is used, so no code change is needed.

For VILA1.5-40B, you should use `--conv-mode hermes-2`.

Yes, the model is trained with those conv modes. In theory, we should bake this parameter into the model config and not let the user change it.

LoRA training is not well supported. I would recommend doing a regular fine-tuning run instead.

Sorry, we merged a problematic PR yesterday and have just rolled it back. Could you pull and try again?

Are you using Llama 3? If so, you need to pass `--conv-mode=llama_3`.