Image prompt with Gemini and GPT-4o-mini
Describe the bug
I want to run evaluations with an image and a user prompt on Gemini 2.0 Flash and GPT-4o-mini. Image input works with GPT-4o-mini, but Gemini returns a bad request error. Am I missing something, or does image prompting with Gemini not work?
To Reproduce
This code produces the error during evaluation:
import promptfoo from "promptfoo";

const config = {
  description: "Test",
  prompts: [
    () => [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: `What is in this image?`,
          },
          {
            type: "image_url",
            image_url: {
              url: `https://upload.wikimedia.org/wikipedia/commons/thumb/5/55/Foto_de_Jose_Eloy_Mart%C3%ADnez_Sim%C3%B3_%28edt.%29.jpg/1280px-Foto_de_Jose_Eloy_Mart%C3%ADnez_Sim%C3%B3_%28edt.%29.jpg`,
            },
          },
        ],
      },
    ],
  ],
  providers: [
    {
      id: "google:gemini-2.0-flash",
    },
  ],
  tests: [
    {
      assert: [
        {
          type: "not-is-json",
        },
      ],
    },
  ],
  writeLatestResults: true,
};

async function run() {
  await promptfoo.evaluate(config, {});
}

run();
Expected behavior
Gemini should use the image as an input for my prompt
- Promptfoo version: 0.109.0
@domephant Hey, I don't think promptfoo converts prompts from the OpenAI format to the Gemini-specific format in this case, but you can make it work by specifying the request in Gemini's format, like this:
// Assumes promptfoo is installed as a dependency of this project.
const promptfoo = require("promptfoo");

// A tiny PNG, base64-encoded, sent inline instead of as a URL.
const base64Image =
  "iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII";
const mimeType = "image/png";

const config = {
  description: "Test Gemini 2.0 Flash with inline image",
  prompts: [
    () => [
      {
        role: "user",
        // Gemini expects "parts" rather than OpenAI's "content".
        parts: [
          {
            text: "What is in this image?",
          },
          {
            inline_data: {
              mime_type: mimeType,
              data: base64Image,
            },
          },
        ],
      },
    ],
  ],
  providers: [
    {
      id: "google:gemini-2.0-flash",
    },
  ],
  tests: [
    {
      assert: [
        {
          type: "not-is-json",
        },
      ],
    },
  ],
  writeLatestResults: true,
};

async function run() {
  const result = await promptfoo.evaluate(config, {});
  console.log(JSON.stringify(result, null, 2));
}

run();
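If the image lives on disk rather than in a hard-coded string, something like the sketch below could build the `inline_data` part. The helper name `imageFileToInlineData` and the `MIME_TYPES` table are made up for illustration and are not part of promptfoo.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical lookup table; extend it with whatever formats you need.
const MIME_TYPES = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
};

// Hypothetical helper: reads an image file and returns a Gemini-style
// inline_data part with the file contents base64-encoded.
function imageFileToInlineData(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image extension: ${ext}`);
  }
  const data = fs.readFileSync(filePath).toString("base64");
  return { inline_data: { mime_type: mimeType, data } };
}
```

The returned object could then take the place of the hard-coded `inline_data` entry in `parts`.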
Hi @domephant,
Thanks for this question. The issue is that OpenAI and Gemini use different formats for image inputs - promptfoo doesn't automatically convert between them.
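To make the difference concrete, here is a rough sketch of how one OpenAI-style message maps onto Gemini's shape. `openAiMessageToGemini` is a hypothetical helper, not something promptfoo provides, and it only handles text parts and images supplied as base64 data URLs.

```javascript
// Hypothetical helper: converts one OpenAI-style chat message into Gemini's shape.
// Only text parts and image_url parts carrying a base64 data URL are handled.
function openAiMessageToGemini(message) {
  // OpenAI allows content to be a plain string; normalize it to an array.
  const content =
    typeof message.content === "string"
      ? [{ type: "text", text: message.content }]
      : message.content;

  const parts = content.map((item) => {
    if (item.type === "text") {
      return { text: item.text };
    }
    if (item.type === "image_url") {
      const match = /^data:([^;,]+);base64,(.+)$/.exec(item.image_url.url);
      if (!match) {
        throw new Error("Only base64 data URLs are handled in this sketch");
      }
      return { inline_data: { mime_type: match[1], data: match[2] } };
    }
    throw new Error(`Unsupported content type: ${item.type}`);
  });

  // Gemini uses the role "model" where OpenAI uses "assistant".
  const role = message.role === "assistant" ? "model" : message.role;
  return { role, parts };
}
```

A plain `https://` image URL is rejected on purpose here, since the image bytes would have to be downloaded and encoded first.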
For evaluating both models, use provider-specific prompts:
openai-prompt.json:
[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "{{question}}"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "{{imageUrl}}"
        }
      }
    ]
  }
]
gemini-prompt.json:
[
  {
    "role": "user",
    "parts": [
      {
        "text": "{{question}}"
      },
      {
        "inline_data": {
          "mime_type": "image/jpeg",
          "data": "{{imageData}}"
        }
      }
    ]
  }
]
promptfooconfig.yaml:
prompts:
  - id: file://openai-prompt.json
    label: openai_vision
  - id: file://gemini-prompt.json
    label: gemini_vision
providers:
  - id: openai:gpt-4o-mini
    prompts: [openai_vision]
  - id: google:gemini-2.0-flash
    prompts: [gemini_vision]
tests:
  - vars:
      question: What do you see?
      imageUrl: https://your-image-url.jpg
      imageData: file://path/to/image.jpg # file:// auto-converts to base64
See Prompts documentation for more on provider-specific prompts.
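If the image is only available as a URL, as in the original report, it would need to be downloaded and base64-encoded before it can go into `inline_data`. A minimal sketch, assuming Node 18+ for the global `fetch`; `imageUrlToInlineData` is a hypothetical helper, not a promptfoo API.

```javascript
// Hypothetical helper: downloads an image and returns a Gemini-style inline_data part.
// fetchImpl is injectable for testing; it defaults to the global fetch (Node 18+).
async function imageUrlToInlineData(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Failed to download ${url}: HTTP ${response.status}`);
  }
  // Strip parameters such as "; charset=..." from the header value.
  const contentType = (response.headers.get("content-type") || "")
    .split(";")[0]
    .trim();
  if (!contentType.startsWith("image/")) {
    throw new Error(`Expected an image, got "${contentType}"`);
  }
  const data = Buffer.from(await response.arrayBuffer()).toString("base64");
  return { inline_data: { mime_type: contentType, data } };
}
```

The result could be passed to a prompt function in the Node API, or its `data` field written out and used as the `imageData` variable.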
If you're still having issues, feel free to comment or re-open.
Thanks!