Dataset Release for InternVL2.5 Training

Open amitbcp opened this issue 10 months ago • 0 comments

Motivation

Hi Team,

Thanks for the great effort and for open-sourcing the MLLM and the report. It was a great read. One key question I wanted to ask is about releasing the dataset mixture used to train the model.

Given the data filtering described in the paper, it could be very expensive to filter the datasets and recreate the training mixture. Do you plan to release it?

Related resources

No response

Additional context

No response

amitbcp, Feb 27 '25 00:02