
How could I train ImageBind completely from scratch?

Open ChloeL19 opened this issue 2 years ago • 4 comments

Thank you for the very cool work! However, I'm having trouble finding your implementation of the NCE loss. I know @fabawi has implemented a version of this for his LoRA fine-tuning variant (kudos). But if I wanted to train the original ImageBind model completely from scratch, how would I do it?

ChloeL19 avatar May 13 '23 23:05 ChloeL19
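For anyone else searching for the loss: the ImageBind paper trains each modality encoder against the image/video encoder with a symmetric InfoNCE objective over paired embeddings. The snippet below is only a minimal PyTorch sketch of that objective, not the authors' implementation; the temperature value is just a common default.

```python
import torch
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss between two batches of paired embeddings.

    z_a, z_b: (batch, dim) embeddings from two modalities (e.g. vision and audio),
    where row i of z_a is paired with row i of z_b.
    """
    # Normalize so the dot product becomes cosine similarity.
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)

    # Similarity matrix scaled by temperature; diagonal entries are the positives.
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)

    # Cross-entropy in both directions (a -> b and b -> a), then average.
    loss_ab = F.cross_entropy(logits, targets)
    loss_ba = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_ab + loss_ba)
```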

You can train the model without LoRA using ImageBind-LoRA: simply remove the --lora argument when calling train.py and set --full_model_checkpointing. I don't have the resources to fine-tune it myself, but it should work in theory. Try it out on the toy dataset provided in the repo, and later you can implement dataloaders for the original datasets.

On a side note, LoRA adaptation has been shown to outperform full-model fine-tuning in many instances and across different applications; this is not specific to ImageBind-LoRA but a general observation about low-rank adaptation. If you would like to see the outcome of fine-tuning with LoRA, check out the example. The weights are in the repo itself. Just install the requirements and follow the instructions in the README.

fabawi avatar May 17 '23 09:05 fabawi

Hi @ChloeL19, may I ask if you have been able to train the model from scratch?

Wolfwjs avatar Jun 06 '23 08:06 Wolfwjs

I want to know how to train the model, and how to apply it after training is complete. How should the results be presented?

wendellgithub0206 avatar Jul 01 '23 07:07 wendellgithub0206
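Regarding applying the model after training: one common pattern, following the usage example in the ImageBind README, is to rebuild the architecture, load your full-model checkpoint, embed inputs from each modality, and present the results as a cross-modal similarity matrix. This is only a sketch; the checkpoint path and file names are placeholders, and the import layout follows the packaged repo and may differ depending on the version you are using.

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Build the architecture, then load your own checkpoint
# ("checkpoints/full_model.pth" is a placeholder; assumes it stores a plain state_dict).
model = imagebind_model.imagebind_huge(pretrained=False)
model.load_state_dict(torch.load("checkpoints/full_model.pth", map_location=device))
model.eval().to(device)

# Embed a few texts and images (file paths are placeholders).
inputs = {
    ModalityType.TEXT: data.load_and_transform_text(["a dog", "a car"], device),
    ModalityType.VISION: data.load_and_transform_vision_data(["dog.jpg", "car.jpg"], device),
}
with torch.no_grad():
    emb = model(inputs)

# Present results as an image-to-text similarity matrix (softmax over the texts).
sims = torch.softmax(emb[ModalityType.VISION] @ emb[ModalityType.TEXT].T, dim=-1)
print(sims)
```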


Why use LoRA? Did the ImageBind paper use it?

K-M-Ibrahim-Khalilullah avatar Jul 21 '23 17:07 K-M-Ibrahim-Khalilullah