gemini-openai-proxy
[Bug] Single-string content is rejected for vision models, and multiple text parts are rejected for non-vision models
First, thanks to @zhu327 and @ekatiyar for your great work.

I am using this fork of the repository and noticed that the following request fails:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GOOGLE_API_KEY" \
  -d '{
    "model": "gemini-1.5-vision-latest",
    "messages": [{"role": "user", "content": "Say this is a test."}],
    "temperature": 0.7
  }'
```

```json
{"code":400,"message":"message.multiContent: json.Unmarshal: json: cannot unmarshal string into Go value of type []openai.ChatMessagePart","type":""}
```
It fails because the proxy forces vision input to be multi-part, but in practice both OpenAI and Gemini vision models accept a plain string without any images (see the decoder sketch after the second example). I also noticed that this request fails:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GOOGLE_API_KEY" \
  -d '{
    "model": "gemini-1.0-pro-latest",
    "messages": [{"role": "user", "content": [
      {"type": "text", "text": "Paraphrase this sentence."},
      {"type": "text", "text": "Say this is a test."}
    ]}],
    "temperature": 0.7
  }'
```
It fails because the proxy forces non-vision text input to be a single part, but in practice both OpenAI and Gemini models accept multiple text parts.
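Both failures share a root cause: in the OpenAI chat API, `content` may be either a bare string or an array of typed parts, yet the proxy commits to one decoding path based on the model name. A tolerant decoder can accept both forms. Here is a minimal, self-contained sketch; the type and field names are illustrative, not the proxy's actual ones:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ChatMessagePart struct {
	Type string `json:"type"`
	Text string `json:"text,omitempty"`
}

type ChatMessage struct {
	Role  string
	Parts []ChatMessagePart
}

// UnmarshalJSON accepts both wire forms of "content": a bare string and an
// array of typed parts, normalizing both into a part list.
func (m *ChatMessage) UnmarshalJSON(data []byte) error {
	var raw struct {
		Role    string          `json:"role"`
		Content json.RawMessage `json:"content"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	m.Role = raw.Role

	// First try the simple form: "content": "some string".
	var s string
	if err := json.Unmarshal(raw.Content, &s); err == nil {
		m.Parts = []ChatMessagePart{{Type: "text", Text: s}}
		return nil
	}
	// Fall back to the multi-part form: "content": [{...}, ...].
	return json.Unmarshal(raw.Content, &m.Parts)
}

func main() {
	for _, in := range []string{
		`{"role":"user","content":"Say this is a test."}`,
		`{"role":"user","content":[{"type":"text","text":"Paraphrase this sentence."},{"type":"text","text":"Say this is a test."}]}`,
	} {
		var msg ChatMessage
		if err := json.Unmarshal([]byte(in), &msg); err != nil {
			panic(err)
		}
		fmt.Printf("%+v\n", msg)
	}
}
```

With `content` normalized to a part list up front, the downstream conversion no longer needs to care which wire form the client sent.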
Since gemini-1.0-pro-vision was deprecated last month, I don't see why we cannot simply merge toStringGenaiContent() and toVisionGenaiContent() into one function that handles all forms of input. I implemented this in my fork and it fixes both bugs. It also removes the extra environment variable used to toggle between Gemini Flash and Pro Vision.
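For reference, a merged converter could look roughly like the sketch below. It assumes the normalized ChatMessage type from the previous sketch and the genai.Text / genai.ImageData helpers from github.com/google/generative-ai-go/genai; image decoding is elided here, and this is not the exact code in my fork:

```go
import (
	"fmt"

	"github.com/google/generative-ai-go/genai"
)

// toGenaiParts converts one normalized message into genai parts, regardless
// of whether the client sent a bare string or a multi-part array. It would
// replace both toStringGenaiContent() and toVisionGenaiContent().
func toGenaiParts(msg ChatMessage) ([]genai.Part, error) {
	parts := make([]genai.Part, 0, len(msg.Parts))
	for _, p := range msg.Parts {
		switch p.Type {
		case "text":
			parts = append(parts, genai.Text(p.Text))
		case "image_url":
			// Fetch or base64-decode the image here, then append it:
			//   parts = append(parts, genai.ImageData("png", data))
			return nil, fmt.Errorf("image decoding elided in this sketch")
		default:
			return nil, fmt.Errorf("unsupported content part type %q", p.Type)
		}
	}
	return parts, nil
}
```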