InternVL
Pre-Training & SFT datasets
Thank you for your excellent work on InternVL3.5! Will the datasets used during the Pre-Training and SFT phases be made public? In the technical report, you mention that some additional data was included in both datasets. How was this data composed or selected? I'd like to know these details so that I can replicate the training pipeline.
Thank you for sharing this excellent work. I am very interested in the dataset you mentioned. Would it be possible for the authors to provide access to this dataset, or share instructions on how to obtain it?