MMInstruct
The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity". The MMInstruct dataset includes 973K instructions from 24 domains...
Thanks for your good work! Can you provide some guidance on how to use your data generation pipeline?
Thank you for sharing a useful dataset. Your paper reports 973K instructions, but the dataset published on Hugging Face (https://huggingface.co/datasets/yuecao0119/MMInstruct-GPT4V) seems to be different. I counted the number of id...
The images folder contains various .gif files. How should I handle them during instruction tuning?
Where are the images referenced in the JSON files stored?
Some JSON files in the all_seed folder are empty (e.g. 00007_attribute_recognition.json). Please fill in the empty JSON files.
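For anyone who wants to check which files are affected, here is a minimal sketch that lists empty JSON files in a folder. It is not part of the MMInstruct repository; the function name `find_empty_json` is hypothetical, and it assumes that "empty" means a zero-byte or whitespace-only file, or one whose top-level value is `[]`, `{}`, or `null`.

```python
import json
from pathlib import Path


def find_empty_json(folder):
    """Return the names of .json files under `folder` that hold no data.

    Hypothetical helper: "empty" here is an assumption, meaning a blank file
    or a top-level value of [], {} or null.
    """
    empty = []
    for path in sorted(Path(folder).rglob("*.json")):
        text = path.read_text(encoding="utf-8").strip()
        if not text:
            empty.append(path.name)  # zero-byte or whitespace-only file
            continue
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue  # malformed rather than empty; not reported here
        if data in ([], {}, None):
            empty.append(path.name)  # parses fine but contains no records
    return empty
```

Calling `find_empty_json("all_seed")` from the directory that contains the all_seed folder would return the file names to report; the relative path is an assumption about where the data was downloaded.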