
Add support for Google Gemini 2.5 Pro image input

Open pidgeon777 opened this issue 8 months ago • 3 comments

It seems that with Google Gemini 2.5 Pro it is possible to specify images as input:

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d '{
  "model": "google/gemini-2.5-pro-exp-03-25:free",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
          }
        }
      ]
    }
  ]
}'

How could this functionality be supported in CopilotChat.nvim? I suppose that's the same model currently provided by GitHub Copilot.
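
For a local image rather than a public URL, the same request shape can probably carry the file inline. Below is a minimal Python sketch of that idea, not CopilotChat.nvim code. It assumes the `image_url` content part from the curl example also accepts a base64 `data:` URL, which is not confirmed anywhere in this thread. The helper names `build_image_message` and `build_request_body` are hypothetical.

```python
import base64
import json
import mimetypes
from pathlib import Path


def build_image_message(prompt: str, image_path: str) -> dict:
    """Build a user message with one text part and one inline image part.

    Assumption: the endpoint accepts a base64 data URL in `image_url.url`,
    in the same place the curl example puts an https URL.
    """
    # Guess the MIME type from the file extension and reject non-images.
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {image_path}")

    # Encode the raw bytes so they can travel inside the JSON body.
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")

    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }


def build_request_body(model: str, prompt: str, image_path: str) -> str:
    """Serialize a chat completion request with a single image message."""
    return json.dumps(
        {"model": model, "messages": [build_image_message(prompt, image_path)]}
    )
```

The resulting string could be passed to curl with `-d @-` in place of the inline JSON above. Checking the MIME type first is also one simple way a client could tell image resources apart from text ones.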

pidgeon777 • Apr 16 '25 16:04

There is some really interesting info here:

Using vision input in Copilot Chat with Claude and Gemini is now in public preview

Also:

https://docs.github.com/en/copilot/using-github-copilot/copilot-chat/asking-github-copilot-questions-in-your-ide#using-images-in-copilot-chat

pidgeon777 • Apr 17 '25 09:04

It's something I can probably look at after finishing tools, since there I can differentiate between resource types, so I should be able to detect images. It won't be easy to support either way, I think.

deathbeam • Apr 17 '25 09:04

This discussion has some hints:

https://github.com/olimorris/codecompanion.nvim/discussions/1208

pidgeon777 • Apr 17 '25 17:04

This is actually also a duplicate of #970; I did not realize at first that vision and image input are the same thing.

deathbeam • Aug 17 '25 09:08