benchmark
The Llava model is not running its vision module.
For the vision module to run, an additional input, pixel_values,
must be provided. This differs from a standard HuggingFaceModel.
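As a rough illustration of the difference, the sketch below lists the inputs a text-only run would receive and the extra pixel_values entry that a vision run needs. The helper name dummy_input_shapes, its parameters, and the default image size of 336 are assumptions made for this example; they are not taken from the benchmark code or from any library.

```python
from typing import Dict, Tuple


def dummy_input_shapes(
    batch_size: int = 1,
    seq_len: int = 16,
    with_vision: bool = False,
    image_size: int = 336,  # assumed resolution, adjust to the checkpoint in use
) -> Dict[str, Tuple[int, ...]]:
    """Hypothetical helper: map each model input name to its tensor shape."""
    # Inputs that a text-only language model run receives.
    shapes: Dict[str, Tuple[int, ...]] = {
        "input_ids": (batch_size, seq_len),
        "attention_mask": (batch_size, seq_len),
    }
    if with_vision:
        # Extra input for the vision module:
        # (batch, channels, height, width).
        shapes["pixel_values"] = (batch_size, 3, image_size, image_size)
    return shapes


if __name__ == "__main__":
    print(dummy_input_shapes())                  # text-only inputs
    print(dummy_input_shapes(with_vision=True))  # adds pixel_values
```

With with_vision left at False, the result has no pixel_values entry, which corresponds to the case described above where the vision module is not exercised.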