unsloth [Fixing] Better vision model finetuning

Open danielhanchen opened this issue 9 months ago • 2 comments

[ ] Allow mixing text and vision data (i.e. rows of data without any images)
[ ] Resize images automatically by checking the preprocessor, since memory usage can explode on large images. See https://github.com/unslothai/unsloth/issues/1524#issuecomment-2584971126
[ ] Allow saving to GGUF for Llava type models
[ ] Ensure exporting to 16bit does not miss any files, e.g. https://github.com/unslothai/unsloth/issues/1521
[ ] train_on_responses_only for VLMs
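
For the automatic resizing item, a minimal sketch of the size calculation might look like the following. It only computes a downscaled target size that preserves aspect ratio; `fit_within` and `max_side` are hypothetical names for illustration, not unsloth API, and the assumption is that the limit would be read from the model's preprocessor settings.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer side is at most max_side.

    Hypothetical helper: max_side is assumed to come from the preprocessor.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    # Clamp to 1 so extreme aspect ratios never produce a zero-sized side.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 1000))
print(fit_within(640, 480, 1000))
```

The resulting tuple could then be passed to an image library's resize call before the image reaches the processor, so large inputs are shrunk once up front rather than blowing up memory during tokenization.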

Jan 19 '25 11:01 danielhanchen