
InternVideo2-Stage1-1B-224p-K400 missing processor/config for Hugging Face transformers

Open Dimlight opened this issue 3 months ago • 0 comments

Hello,

I am trying to use the Hugging Face model OpenGVLab/InternVideo2-Stage1-1B-224p-K400 with the transformers library for video feature extraction.

When I call:

from transformers import AutoImageProcessor
processor = AutoImageProcessor.from_pretrained("OpenGVLab/InternVideo2-Stage1-1B-224p-K400")

I get the error:

OSError: Can't load image processor for 'OpenGVLab/InternVideo2-Stage1-1B-224p-K400'.
... no preprocessor_config.json file

Looking at the repo, it only contains:

.gitattributes
1B_ft_k710_ft_k400_f16.pth
1B_ft_k710_ft_k400_f8.pth
README.md

There is no config.json or preprocessor_config.json.

Without these files, the repo cannot be loaded through AutoImageProcessor / AutoVideoProcessor.
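The check above can be reproduced programmatically. The following is a minimal sketch, not an official diagnostic: the helper names `missing_transformers_files` and `list_hub_files` are hypothetical, and the list of required files is an assumption based only on the two files named in this issue. The Hub lookup uses `huggingface_hub.list_repo_files` and is kept in a separate function so the comparison itself runs offline.

```python
# Files this issue reports as absent; treating them as "required" is an
# assumption for illustration, not an exhaustive transformers requirement.
REQUIRED_FILES = ("config.json", "preprocessor_config.json")


def missing_transformers_files(repo_files, required=REQUIRED_FILES):
    """Return the required file names that do not appear in repo_files."""
    present = set(repo_files)
    return [name for name in required if name not in present]


def list_hub_files(repo_id):
    """Fetch the file listing of a Hub repo (needs network access)."""
    # Imported lazily so the offline check works without huggingface_hub.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id)


# File listing copied from the issue text above.
listed_in_issue = [
    ".gitattributes",
    "1B_ft_k710_ft_k400_f16.pth",
    "1B_ft_k710_ft_k400_f8.pth",
    "README.md",
]

missing = missing_transformers_files(listed_in_issue)
print("Missing files:", missing)
```

To run the same check against the live repo, pass `list_hub_files("OpenGVLab/InternVideo2-Stage1-1B-224p-K400")` in place of `listed_in_issue`.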

Request

Could you add the appropriate processor/config files (e.g. preprocessor_config.json, config.json) so the model can be loaded via transformers?

Alternatively, could you provide guidance on the recommended way to preprocess inputs for this model when using Hugging Face?

Thanks a lot for releasing this model!

Dimlight · Sep 23 '25 02:09