
selective modality finetune

Open JianbangZ opened this issue 2 years ago • 1 comments

Thanks for the awesome work! Suppose I have my own audio-text dataset and want to finetune only the audio and text modalities. How can I achieve that?

JianbangZ, May 10 '23 14:05

I created a simple ImageBind finetuning example using LoRA: https://github.com/fabawi/ImageBind-LoRA

Make sure you clone it recursively to include the example dataset: git clone --recurse-submodules -j8 git@github.com:fabawi/ImageBind-LoRA.git

Install the requirements following the instructions provided in this repo, then run train.py.

This should log your checkpoints, as well as the separate LoRA weights in case you'd like to update the original model without saving all the model params. More examples and finer control will be added soon.

Selective fine-tuning is also possible; check out https://github.com/fabawi/ImageBind-LoRA/blob/09427ff4bcff2ef20a350cfea5aec3ca11a09af7/train.py#L220 for the relevant settings. For now, you can manually modify lora_modality_names and lora_layer_idxs to specify which LoRA layers get finetuned.

fabawi, May 13 '23 12:05
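
To make the idea of selecting by modality and layer index concrete, here is a minimal sketch of the selection logic only. It is not code from ImageBind-LoRA: the helper `select_trainable`, the parameter-name layout `modality_trunks.<modality>.blocks.<idx>...`, and the dict shape used for the layer indices are all assumptions made for illustration, and the real `lora_modality_names` / `lora_layer_idxs` settings may be structured differently.

```python
import re

# Assumed (hypothetical) naming scheme for transformer-block parameters,
# e.g. "modality_trunks.audio.blocks.3.attn.in_proj_weight".
_BLOCK_RE = re.compile(r"^modality_trunks\.(?P<modality>\w+)\.blocks\.(?P<idx>\d+)\.")


def select_trainable(param_names, modality_names, layer_idxs=None):
    """Return the parameter names that should stay trainable.

    modality_names: modality keys to finetune, e.g. ["audio", "text"].
    layer_idxs: optional dict mapping a modality to the block indices to
        finetune; a modality missing from the dict keeps all of its blocks.
    """
    layer_idxs = layer_idxs or {}
    wanted = set(modality_names)
    selected = []
    for name in param_names:
        match = _BLOCK_RE.match(name)
        if match is None:
            continue  # not a trunk block parameter under the assumed layout
        modality = match.group("modality")
        if modality not in wanted:
            continue  # modality was not requested, so it stays frozen
        allowed = layer_idxs.get(modality)
        if allowed is None or int(match.group("idx")) in allowed:
            selected.append(name)
    return selected


# Illustrative parameter names only; they are not read from a real checkpoint.
names = [
    "modality_trunks.vision.blocks.0.attn.in_proj_weight",
    "modality_trunks.audio.blocks.0.attn.in_proj_weight",
    "modality_trunks.audio.blocks.1.attn.in_proj_weight",
    "modality_trunks.text.blocks.0.attn.in_proj_weight",
]
trainable = select_trainable(names, ["audio", "text"], {"audio": [1]})
```

With a PyTorch model, the returned names could then drive plain freezing by iterating over `model.named_parameters()` and setting `param.requires_grad = name in trainable`. Note that this freezes or unfreezes existing weights, whereas LoRA adds small trainable adapter matrices alongside frozen weights.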