Jaber

16 comments of Jaber

Thank you for your question. Could you share which script you were running when this problem occurred?

Hello, the tokenizer used by OpenELM is the tokenizer from Llama, so you need to specify the tokenizer type in the script. However, the script for TinyLlama does not specify...
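As a rough illustration (the exact argument names in the training script may differ, and the model id below is an assumption, not necessarily the one the script uses), loading a Llama tokenizer for OpenELM with Hugging Face transformers looks something like this:

```python
# Sketch only: OpenELM does not ship its own tokenizer, so a Llama
# tokenizer is loaded explicitly. The model id is illustrative
# (gated; any Llama-family tokenizer works for illustration).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # Llama tokenizer reused for OpenELM
    use_fast=False,
)
print(tokenizer.tokenize("hello world"))
```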

It seems you are using the tinyllava_bench branch version of the inference code. Please refer to the "Launch Demo Locally" section in the README for the new version of the...

If you are using LoRA fine-tuning and the path string where you save the model does not include 'lora', you can try renaming your model save path to add '_lora'.

When you ran the evaluation, was the model_path in load_pretrained_model set to "/scratch/riggi/Analysis/MLProjects/TinyLLaVA/fine-tuning/radioimg-dataset/TinyLLaVA-Phi-2-SigLIP-3.1B/vision_freeze/nepochs2/_lora"?

Could you please check the function that loads model parameters in the [codebase](https://github.com/TinyLLaVA/TinyLLaVA_Factory/blob/main/tinyllava/model/load_model.py#L39)? When loading model weights, it checks if the model path contains the string "lora" to load the...
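As a minimal sketch of that check (paraphrasing the idea described above, not the verbatim source; the real function in load_model.py may differ in its details):

```python
# Paraphrased sketch of the behavior described above: LoRA adapter weights
# are only merged when the checkpoint path contains the substring "lora".
def needs_lora_merge(model_name_or_path: str) -> bool:
    return "lora" in model_name_or_path

# A fine-tuned checkpoint saved without "lora" in its directory name is
# therefore loaded as a plain model, silently skipping the adapter merge.
print(needs_lora_merge(".../vision_freeze/nepochs2"))       # False
print(needs_lora_merge(".../vision_freeze/nepochs2_lora"))  # True
```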

There is a bug in our codebase. Please change [@register_connector('mof_mlp')](https://github.com/TinyLLaVA/TinyLLaVA_Factory/blob/main/tinyllava/model/connector/mof_mlp.py#L47) to @register_connector('mof'), and then change CN_VERSION=mof_mlp to CN_VERSION=mof in your script.
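For context, a name-based registry of this kind works roughly as below (a simplified sketch, not the factory's actual implementation). The lookup key must match the registered name, which is why the decorator string and CN_VERSION have to agree:

```python
# Simplified sketch of a name-based registry; names are hypothetical.
CONNECTOR_REGISTRY = {}

def register_connector(name):
    def wrapper(cls):
        CONNECTOR_REGISTRY[name] = cls  # map the name to the class
        return cls
    return wrapper

@register_connector('mof')  # must match CN_VERSION=mof in the script
class MoFConnector:
    pass

# Lookup at build time: CN_VERSION=mof_mlp would raise a KeyError here.
connector_cls = CONNECTOR_REGISTRY['mof']
print(connector_cls.__name__)
```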

Different models use different tokenizers, and different tokenizers split the same text into different numbers of tokens, so the positions of the corresponding labels differ.
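A toy illustration of the effect (stand-in tokenizers, not the project's code):

```python
# Toy illustration: two stand-in tokenizers split the same prompt into a
# different number of tokens, so the label positions that must be masked
# as "prompt" (here with -100, the usual ignore index) differ.
prompt, answer = "What is shown? ", "A cat."

def char_tokenize(text):   # stand-in tokenizer A
    return list(text)

def word_tokenize(text):   # stand-in tokenizer B
    return text.split()

for tokenize in (char_tokenize, word_tokenize):
    prompt_len = len(tokenize(prompt))
    labels = [-100] * prompt_len + tokenize(answer)
    print(prompt_len, labels[:prompt_len + 2])
```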

You only need to modify the GPU configuration in the DeepSpeed launch scripts for pretraining and finetuning. For example, change `deepspeed --include localhost:4,5,6,7` in [pretrain.sh](https://github.com/TinyLLaVA/TinyLLaVA_Factory/blob/main/scripts/train/pretrain.sh#L22) to `deepspeed --include localhost:0,1,2,3,4,5,6,7`.

Thanks for the reminder. TinyLLaVA-0.55B actually uses OpenELM-450M-Instruct as the LLM and clip-vit-base-patch16 as the VisionTower. The config.json file in the Hugging Face repository is correct. I have updated the description...