swift
swift copied to clipboard
Ability to support mixture of text-only and image/text conversations for llava1.6 finetuning
Describe the feature The original implementation does support and there is need to fine-tune the LLM only for some textual knowledge sometimes. Right now the repo says it only support conversations with inputs.
Paste any useful information https://github.com/haotian-liu/LLaVA