Simon Willison

Results: 2765 comments by Simon Willison

Basic Gemini example from https://github.com/simonw/llm-gemini/blob/4195c4396834e5bccc3ce9a62647591e1b228e2e/llm_gemini.py (my `images` branch):
```python
messages = []
if conversation:
    for response in conversation.responses:
        messages.append(
            {"role": "user", "parts": [{"text": response.prompt.prompt}]}
        )
        messages.append({"role": "model", "parts": [{"text": response.text()}]})
...
```

Example from Google AI Studio:
```bash
API_KEY="YOUR_API_KEY"
# TODO: Make the following files available on the local file system.
FILES=("image.jpg")
MIME_TYPES=("image/jpeg")
for i in "${!FILES[@]}"; do
  NUM_BYTES=$(wc -c < "${FILES[$i]}")
...
```

Here's Gemini Pro accepting multiple images at once: https://ai.google.dev/gemini-api/docs/vision?lang=python#prompt-multiple
```python
import PIL.Image
sample_file = PIL.Image.open('sample.jpg')
sample_file_2 = PIL.Image.open('piranha.jpg')
sample_file_3 = PIL.Image.open('firefighter.jpg')
model = genai.GenerativeModel(model_name="gemini-1.5-pro")
prompt = (
    "Write an advertising...
```

I just saw that Gemini has been trained to return bounding boxes. https://ai.google.dev/gemini-api/docs/vision?lang=python#bbox I tried this:
```pycon
>>> import google.generativeai as genai
>>> genai.configure(api_key="...")
>>> model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")
>>> import PIL.Image...
```

I don't think those bounding boxes are in the right places. I built a Claude Artifact to render them, and I may not have built it right, but I got...

Tried it again with this photo of goats and got a slightly more convincing result:
![CleanShot 2024-08-25 at 20 31 40@2x](https://github.com/user-attachments/assets/d8507108-b0c3-46c1-ab41-d7931aa2a25f)
![goats](https://github.com/user-attachments/assets/636a7042-843b-4e65-913b-29bd1f35571a)
```pycon
>>> goats = PIL.Image.open("/tmp/goats.jpeg")
>>> prompt = 'Return...
```

Oh! I tried different varieties of coordinate and it turned out this one rendered correctly:
```
[255, 473, 800, 910]
[96, 63, 700, 390]
```
Rendered: ![CleanShot 2024-08-25 at 20...
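A minimal sketch of the coordinate handling, for anyone trying to render boxes like the two above. It assumes the `[ymin, xmin, ymax, xmax]` ordering on a 0 to 1000 scale that the Gemini vision docs describe; the excerpt above does not say which ordering turned out to be the right one. `parse_boxes`, `to_pixels` and the 2000x1000 image size are hypothetical, made up for illustration.

```python
import re


def parse_boxes(text):
    """Pull every [a, b, c, d] group of four integers out of model output text."""
    pattern = r"\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]"
    return [tuple(int(n) for n in match) for match in re.findall(pattern, text)]


def to_pixels(box, width, height, scale=1000):
    """Convert an assumed (ymin, xmin, ymax, xmax) box on a 0..scale grid
    into (left, top, right, bottom) pixel coordinates."""
    ymin, xmin, ymax, xmax = box
    return (
        round(xmin / scale * width),
        round(ymin / scale * height),
        round(xmax / scale * width),
        round(ymax / scale * height),
    )


# The two boxes from the comment above, on a hypothetical 2000x1000 pixel image.
boxes = parse_boxes("[255, 473, 800, 910] [96, 63, 700, 390]")
pixel_boxes = [to_pixels(box, width=2000, height=1000) for box in boxes]
```

The resulting `(left, top, right, bottom)` tuples are in the order that Pillow's `ImageDraw.rectangle` accepts, so they can be drawn straight onto the opened image. If the boxes land in the wrong places, swapping the x and y positions in `to_pixels` is the first thing to try.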

I mucked around a bunch and came up with this, which seems to work: https://static.simonwillison.net/static/2024/gemini-bounding-box-tool-fixed.html It does a better job with the pelicans, though clearly those boxes aren't right. The...

Fun, with this heron it found the reflection too:
![CleanShot 2024-08-25 at 21 01 56@2x](https://github.com/user-attachments/assets/63f6914a-98d9-4b52-ae23-2259a43ddec7)
![heron](https://github.com/user-attachments/assets/9caa06ef-590d-45f8-ad95-f05da7251e88)
```pycon
>>> heron = PIL.Image.open("/tmp/heron.jpeg")
>>> prompt = 'Return bounding boxes around every heron,...
```

Based on all of that, I built this tool: https://tools.simonwillison.net/gemini-bbox You have to paste in a Gemini API key when you use it, which gets stashed in `localStorage` (like my...