LAVIS
Questions about reproducing the BLIP-2 examples
Hi. I'm trying to use your colab.
I'm trying the most powerful model (the default in the colab):
from lavis.models import load_model_and_preprocess

model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_t5", model_type="pretrain_flant5xxl", is_eval=True, device=device
)
I'm trying to reproduce two examples using the question-answering prompt format from the colab:
ans = model.generate({"image": image, "prompt": f"Question: {question} Answer:"})
I also tried feeding the question as the prompt, as is:
ans = model.generate({"image": image, "prompt": f"{question}"})
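To make the comparison concrete, here is a minimal sketch of how I build the two prompt variants above. The helper name `build_prompts` and the sample question are placeholders of mine, not part of LAVIS or the colab; only the two prompt strings match what I pass to `model.generate`.

```python
def build_prompts(question: str) -> dict:
    """Return the two prompt variants compared in this issue.

    Hypothetical helper, not a LAVIS function.
    """
    question = question.strip()
    return {
        # Question-answering format used in the colab
        "qa": f"Question: {question} Answer:",
        # The question fed to the model unchanged
        "raw": question,
    }


# Placeholder question for illustration only
prompts = build_prompts("What is shown in the image?")
for name, prompt in prompts.items():
    print(f"{name}: {prompt}")
    # With the model and image loaded as above, each variant is passed as:
    # ans = model.generate({"image": image, "prompt": prompt})
```

Both variants give me the same short answers described below.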
I'm trying to reproduce this example:
I receive the following output:
- I receive only pepperoni, not the other ingredients.
And for this one, this is the output I receive:
- I don't receive any explanation beyond "yes". The paper figure shows "it's a house that looks like it's upside down".
How can I get the behavior shown in the paper?
Thanks