LLaVA
[bug] It seems that LLaVA/llava/train/train.py has not been updated for v1.6
Describe the issue
For example, the model_args in the current LLaVA/llava/train/train.py has no image_grid_pinpoints parameter, but that parameter is used in llava/mm_utils.py, line 176: image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints)
I guess:

    possible_resolutions = [(2,2), (1,2), (2,1), (1,3), (3,1), (1,4), (4,1)]
    possible_resolutions = [(x * 336, y * 336) for x, y in possible_resolutions]
    image_grid_pinpoints = possible_resolutions
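To make the guess above easier to try out, here is a minimal, self-contained sketch. It builds the same list of pinpoints and then picks the best-fitting one for a given image size. The 336 base size and the grid layouts come from the comment above; `build_grid_pinpoints` and `pick_resolution` are hypothetical helper names written for this illustration, not functions from the LLaVA code base, and the selection rule (keep the most pixels, then waste the least canvas) is an assumption about what a reasonable any-resolution picker would do.

```python
PATCH_SIZE = 336  # base tile size, taken from the guess above

# Tile layouts guessed above, as multiples of the base tile size.
GRID_LAYOUTS = [(2, 2), (1, 2), (2, 1), (1, 3), (3, 1), (1, 4), (4, 1)]


def build_grid_pinpoints(layouts, patch_size=PATCH_SIZE):
    """Turn (x, y) tile counts into pixel resolutions."""
    return [(x * patch_size, y * patch_size) for x, y in layouts]


def pick_resolution(image_size, candidates):
    """Hypothetical picker: choose the candidate that keeps the most image
    pixels after an aspect-preserving fit, breaking ties by least unused area.
    Sizes are treated as (width, height); this ordering is an assumption."""
    w, h = image_size
    best, best_key = None, None
    for cw, ch in candidates:
        # Integer arithmetic avoids float rounding when fitting the image.
        if cw * h <= ch * w:   # candidate width is the limiting side
            fit_w, fit_h = cw, h * cw // w
        else:                  # candidate height is the limiting side
            fit_w, fit_h = w * ch // h, ch
        effective = min(fit_w * fit_h, w * h)  # never count upscaled pixels
        wasted = cw * ch - effective
        key = (-effective, wasted)
        if best_key is None or key < best_key:
            best, best_key = (cw, ch), key
    return best


image_grid_pinpoints = build_grid_pinpoints(GRID_LAYOUTS)
print(image_grid_pinpoints)
print(pick_resolution((1000, 500), image_grid_pinpoints))
```

For a 1000x500 image this sketch selects (672, 336), since that candidate matches the 2:1 aspect ratio with no unused canvas. Whether the released v1.6 training code uses the same list or the same rule is not confirmed in this thread.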
Is the code suggested by @jayyoung0802 meant for training all v1.6 models? Has anyone trained a v1.6 model successfully with it, and if so, which one?
The current code base does not support training v1.6 models yet. We'll release the training code and data with v1.6 models soon. Thanks.
I ran into this problem again while reading the configs of the llava-1.6 models. It seems that not all of the resolutions mentioned above are supported. Please see my new issue, https://github.com/haotian-liu/LLaVA/issues/1644, for details.
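One way to check this for a given checkpoint is to compare a guessed list against what the model config actually declares. The sketch below assumes the config is a JSON file with an image_grid_pinpoints key holding a list of two-element lists; the helper names and the file path are hypothetical, and the key's exact storage format may differ between releases.

```python
import json


def load_grid_pinpoints(config_path):
    """Read image_grid_pinpoints from a JSON config file (assumed layout)."""
    with open(config_path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    # Fall back to an empty list if the key is absent.
    return {tuple(p) for p in cfg.get("image_grid_pinpoints", [])}


def missing_from_config(guessed, config_path):
    """Return the guessed resolutions that the config does not list."""
    configured = load_grid_pinpoints(config_path)
    return [tuple(r) for r in guessed if tuple(r) not in configured]


# Example call (the path is a placeholder):
# print(missing_from_config([(672, 672), (336, 1344)], "path/to/config.json"))
```

Note that this comparison is order-sensitive within each pair, so a list written as (height, width) will not match one written as (width, height).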