annopackage
Hi, Haotian, thanks for your great work. Currently, I want to increase the model's max length to support a larger image size. If we change the parameter, do we need to change...
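A minimal sketch of the two places the sequence budget usually lives when raising the image resolution: the tokenizer's truncation limit and the config's position limit. The checkpoint path and length value here are hypothetical, not taken from the issue.

```python
from transformers import AutoConfig, AutoTokenizer

model_path = "path/to/llava-checkpoint"  # hypothetical path
new_max_len = 4096  # e.g. a 448px image with 14px patches yields (448/14)**2 = 1024 vision tokens

tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(model_path)

tokenizer.model_max_length = new_max_len  # truncation limit for text inputs
config.max_position_embeddings = max(config.max_position_embeddings, new_max_len)
# With RoPE-based LLMs no new weights are required, but quality can degrade
# beyond the sequence length the model was actually trained on.
```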
**Describe the bug** The bug occurs when calling the dataloader with multiple num_workers. Here, 'trainer' is initialized from transformers. If I...
**Is your feature request related to a problem? Please describe.** Optimizer states are saved at every save_steps interval, which consumes a lot of time and disk space. **Describe the solution...
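Recent versions of transformers expose a flag that addresses this directly; a minimal sketch (the output directory and interval are placeholders):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints",
    save_steps=500,
    save_only_model=True,  # skip optimizer/scheduler state in periodic checkpoints
)
# Caveat from the transformers docs: checkpoints saved this way contain only
# model weights, so training cannot be resumed from them.
```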
Hi, thanks for your great work and open source code. I want to reproduce the pre-training stage, but i do not find the implementation. Do you have any plan to...
There are some mistakes in the annotations below: {"reason": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], "file_name": "dc9226f0-e1cb0a60.jpg"}...
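A hypothetical sanity check for records shaped like the one quoted above. The field names ("reason", "file_name") come from the record itself; the file name and layout (JSON Lines) are assumptions.

```python
import json

with open("annotations.jsonl") as f:  # assumed: one JSON record per line
    expected_len = None
    for line_no, line in enumerate(f, 1):
        rec = json.loads(line)
        reasons = rec["reason"]
        if expected_len is None:
            expected_len = len(reasons)  # use the first record as the reference length
        if len(reasons) != expected_len:
            print(f"line {line_no}: unexpected flag count in {rec['file_name']}")
        if not all(v in (0, 1) for v in reasons):
            print(f"line {line_no}: non-binary flag in {rec['file_name']}")
```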
Hi, thanks for your great work. I wonder if you have compared your small vit-300m-448px model with other CLIP models.
How did you unify the format of the pre-training dataset? During the supervised fine-tuning stage, the training data are curated as question-answer pairs. For caption or detection datasets, I...
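A hedged sketch of one common convention in LLaVA-style pipelines: wrap a bare caption as an instruction/response pair so pre-training data matches the conversational fine-tuning format. The prompt strings are illustrative, not the authors'.

```python
import random

CAPTION_PROMPTS = [
    "Describe the image briefly.",
    "What is shown in this picture?",
]

def caption_to_conversation(image_file: str, caption: str) -> dict:
    """Convert a (image, caption) pair into a single-turn conversation record."""
    return {
        "image": image_file,
        "conversations": [
            {"from": "human", "value": "<image>\n" + random.choice(CAPTION_PROMPTS)},
            {"from": "gpt", "value": caption},
        ],
    }

print(caption_to_conversation("000001.jpg", "A dog running on the beach."))
```

Detection datasets are typically handled the same way, with boxes verbalized into the answer text.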
Hi, thanks for your great work. I was wondering how many GPUs are needed to train LLaVA-NeXT with a 72B LLM.
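A back-of-the-envelope estimate, assuming full fine-tuning with Adam in mixed precision (the standard ~16 bytes/parameter accounting; activation memory excluded):

```python
params = 72e9
weights_bf16 = params * 2        # 2 bytes per parameter
grads_bf16 = params * 2
adam_states_fp32 = params * 4 * 2  # first and second moments, 4 bytes each
master_fp32 = params * 4           # fp32 master copy of the weights

total_gb = (weights_bf16 + grads_bf16 + adam_states_fp32 + master_fp32) / 1e9
print(f"~{total_gb:.0f} GB before activations")  # ~1152 GB
# With ZeRO-3-style sharding this total divides across GPUs, so dozens of
# 80 GB GPUs are typically needed even before counting activation memory.
```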