
Including Image in System Message

Open ThomasSpencerPugh opened this issue 9 months ago • 1 comments

Here is the abbreviated prompty:

---
name: PDF Page Classifier
description: Determines if a given page starts a new contract form, based on the top portion of the page and, if available, the following page.
version: 0.1.0
authors:
    - Thomas Pugh

model:
    api: chat
    configuration:
        type: serverless
        endpoint: XXXXXXXXXXXXXX
        model: Phi-3.5-vision-instruct
        key: XXXXXXXXXXXXXXXXXX
        response_format:
            type: json_object
    parameters:
        max_tokens: 512
        temperature: 0
    response: first

inputs:
    image_one:
        type: image
    image_two:
        type: image

outputs:
    thought:
        description: A concise explanation of the reasoning process used to analyze the provided pages.
        type: string
    result:
        description: A boolean value indicating whether the page starts a new form.
        type: boolean
    confidence:
        description: A string indicating the confidence level of the classification. ["high", "medium", "low", "minimal"]
        type: string

sample:
    image_one: ./test_images/false-1-1-small.png
    image_two: XXXXXXXXXXX
---
system:
 ... cut for brevity ...
# Example

**Input Scenario 1:**
Page 1: 
![image](./test_images/false-1-1.png) <------------ EMPHASIS ON THIS LINE
Page 2:
![image]

**Output 1:**
```json
{
  "thought": "...brevity...",
  "result": false,
  "confidence": "high"
}
```

user:

Page 1:
![image]({{image_one}})
Page 2:
![image]({{image_two}})

The flow fails at the "complete" step. The trace for that step and the resulting exception look like this:

.... cut up to this point ...
 "name": "ChatCompletionsClient",
                                "__time": {
                                    "start": "2025-03-18T13:20:00.665691",
                                    "end": "2025-03-18T13:20:00.665691",
                                    "duration": 0
                                },
                                "type": "LLM",
                                "signature": "azure.ai.inference.ChatCompletionsClient.ctor",
                                "description": "Azure Unified Inference SDK Chat Completions Client",
                                "inputs": {
                                    "endpoint": "*************",
                                    "credential": "**********"
                                },
                                "result": "<azure.ai.inference._patch.ChatCompletionsClient object at 0x0000027591A9EC60>"
                            },
                            {
                                "name": "complete",
                                "__time": {
                                    "start": "2025-03-18T13:20:00.665691",
                                    "end": "2025-03-18T13:20:02.068908",
                                    "duration": 1403
                                },
                                "type": "LLM",
                                "signature": "azure.ai.inference.ChatCompletionsClient.complete",
                                "description": "Azure Unified Inference SDK Chat Completions Client",
                                "inputs": {
                                    "model": "Phi-3.5-vision-instruct",
                                    "messages": [
                                        {
                                            "role": "system",
                                            "content": [
                                                {
                                                    "type": "text",
                                                    "text": ... cut for brevity ...
                                                },
                                                {
                                                    "type": "image_url",
                                                    "image_url": {
                                                        "url": "data:image/png;base64, ...cut...
                                                    }
                                                },
                                                {
                                                    "type": "text",
                                                    "text": " ...cut...
                                                }
                                            ]
                                        },
                                        {
                                            "role": "user",
                                            "content": [
                                                {
                                                    "type": "text",
                                                    "text": "Page 1:"
                                                },
                                                {
                                                    "type": "image_url",
                                                    "image_url": {
                                                        "url": "data:image/png;base64, ...cut...
                                                    }
                                                },
                                                {
                                                    "type": "text",
                                                    "text": "Page 2:"
                                                },
                                                {
                                                    "type": "image_url",
                                                    "image_url": {
                                                        "url":  ... cut
                                                    }
                                                }
                                            ]
                                        }
                                    ],
                                    "max_tokens": 512,
                                    "temperature": 0
                                }
                            }
                        ],
                        "result": {
                            "exception": {
                                "type": "<class 'azure.core.exceptions.HttpResponseError'>",
                                "traceback": [
                                    "  File \"C:\\Users\\tpugh\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\prompty\\tracer.py\", line 170, in wrapper\n    result = func(*args, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^\n",
                                    "  File \"C:\\Users\\tpugh\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\prompty\\invoker.py\", line 71, in run\n    return self.invoke(data)\n           ^^^^^^^^^^^^^^^^^\n",
                                    "  File \"C:\\Users\\tpugh\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\prompty\\serverless\\executor.py\", line 122, in invoke\n    r = client.complete(**eargs)\n        ^^^^^^^^^^^^^^^^^^^^^^^^\n",
                                    "  File \"C:\\Users\\tpugh\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\azure\\ai\\inference\\_patch.py\", line 738, in complete\n    raise HttpResponseError(response=response)\n"
                                ],
                                "message": "(Bad Request) {\"object\":\"error\",\"message\":\"3 validation errors for ValidatorIterator\\n1.text\\n  Field required [type=missing, input_value={'type': 'image_url', 'im...AFwAAAABJRU5ErkJggg=='}}, input_type=dict]\\n    For further information visit https://errors.pydantic.dev/2.9/v/missing\\n1.type\\n  Input should be 'text' [type=literal_error, input_value='image_url', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.9/v/literal_error\\n1.image_url\\n  Extra inputs are not permitted [type=extra_forbidden, input_value={'url': 'data:image/png;b...VAFwAAAABJRU5ErkJggg=='}, input_type=dict]\\n    For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden\",\"type\":\"BadRequestError\",\"param\":null,\"code\":400}\nCode: Bad Request\nMessage: {\"object\":\"error\",\"message\":\"3 validation errors for ValidatorIterator\\n1.text\\n  Field required [type=missing, input_value={'type': 'image_url', 'im...AFwAAAABJRU5ErkJggg=='}}, input_type=dict]\\n    For further information visit https://errors.pydantic.dev/2.9/v/missing\\n1.type\\n  Input should be 'text' [type=literal_error, input_value='image_url', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.9/v/literal_error\\n1.image_url\\n  Extra inputs are not permitted [type=extra_forbidden, input_value={'url': 'data:image/png;b...VAFwAAAABJRU5ErkJggg=='}, input_type=dict]\\n    For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden\",\"type\":\"BadRequestError\",\"param\":null,\"code\":400}",
                                "args": "('(Bad Request) {\"object\":\"error\",\"message\":\"3 validation errors for ValidatorIterator\\\\n1.text\\\\n  Field required [type=missing, input_value={\\'type\\': \\'image_url\\', \\'im...AFwAAAABJRU5ErkJggg==\\'}}, input_type=dict]\\\\n    For further information visit https://errors.pydantic.dev/2.9/v/missing\\\\n1.type\\\\n  Input should be \\'text\\' [type=literal_error, input_value=\\'image_url\\', input_type=str]\\\\n    For further information visit https://errors.pydantic.dev/2.9/v/literal_error\\\\n1.image_url\\\\n  Extra inputs are not permitted [type=extra_forbidden, input_value={\\'url\\': \\'data:image/png;b...VAFwAAAABJRU5ErkJggg==\\'}, input_type=dict]\\\\n    For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden\",\"type\":\"BadRequestError\",\"param\":null,\"code\":400}\\nCode: Bad Request\\nMessage: {\"object\":\"error\",\"message\":\"3 validation errors for ValidatorIterator\\\\n1.text\\\\n  Field required [type=missing, input_value={\\'type\\': \\'image_url\\', \\'im...AFwAAAABJRU5ErkJggg==\\'}}, input_type=dict]\\\\n    For further information visit https://errors.pydantic.dev/2.9/v/missing\\\\n1.type\\\\n  Input should be \\'text\\' [type=literal_error, input_value=\\'image_url\\', input_type=str]\\\\n    For further information visit https://errors.pydantic.dev/2.9/v/literal_error\\\\n1.image_url\\\\n  Extra inputs are not permitted [type=extra_forbidden, input_value={\\'url\\': \\'data:image/png;b...VAFwAAAABJRU5ErkJggg==\\'}, input_type=dict]\\\\n    For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden\",\"type\":\"BadRequestError\",\"param\":null,\"code\":400}',)"
                            }
                        }
                    }
                ],
... rest cut ...

The cause appears to be the static image in the system prompt: removing it makes the error go away. Please let me know if I can provide additional context.
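To make the suspected cause easier to check, here is a minimal sketch that scans a chat payload shaped like the one in the trace and reports every image part found in a system message. The function name `find_system_image_parts` and the sample payload are hypothetical and only mirror the structure shown above; this is not part of prompty or the Azure SDK.

```python
from typing import Any


def find_system_image_parts(messages: list[dict[str, Any]]) -> list[tuple[int, int]]:
    """Return (message_index, part_index) for each image part in a system message."""
    hits: list[tuple[int, int]] = []
    for m_idx, message in enumerate(messages):
        if message.get("role") != "system":
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content cannot carry an image part
        for p_idx, part in enumerate(content):
            if isinstance(part, dict) and part.get("type") == "image_url":
                hits.append((m_idx, p_idx))
    return hits


# Hypothetical payload with the same shape as the "messages" list in the trace.
messages = [
    {
        "role": "system",
        "content": [
            {"type": "text", "text": "Instructions..."},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
            {"type": "text", "text": "More instructions..."},
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Page 1:"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,BBBB"}},
        ],
    },
]

print(find_system_image_parts(messages))
```

For this payload the only hit is part index 1 of the system message, which appears to line up with the `1.text`, `1.type`, and `1.image_url` prefixes in the validation errors.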

ThomasSpencerPugh, Mar 18 '25 17:03

Interesting - I will need to take a look at how to embed images in a serverless call in the SDK - maybe things need to be parsed differently for that provider.

sethjuarez, Mar 23 '25 07:03

Looks like there are some models that don't support images in the system message :/ Feel free to reopen if you know differently - happy to adjust.

sethjuarez, Jul 11 '25 01:07
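If the model really does reject images in the system message, one possible workaround (not confirmed anywhere in this thread) is to keep the system message text-only and move the example image into a separate user message before sending. The sketch below assumes the payload shape from the trace; `move_system_images_to_user` and the label text are hypothetical, and nothing here is prompty or Azure SDK API.

```python
import copy
from typing import Any


def move_system_images_to_user(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of messages with image parts moved out of system messages."""
    result = copy.deepcopy(messages)  # leave the caller's payload untouched
    moved: list[dict[str, Any]] = []
    last_system = -1
    for idx, message in enumerate(result):
        if message.get("role") != "system":
            continue
        last_system = idx
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content has no image parts
        moved.extend(p for p in content if p.get("type") == "image_url")
        message["content"] = [p for p in content if p.get("type") != "image_url"]
    if moved:
        # Assumption: the example image is still useful as an extra user turn.
        label = {"type": "text", "text": "Example image referenced in the system message:"}
        result.insert(last_system + 1, {"role": "user", "content": [label, *moved]})
    return result


# Hypothetical payload mirroring the structure in the trace.
messages = [
    {
        "role": "system",
        "content": [
            {"type": "text", "text": "Instructions..."},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
        ],
    },
    {"role": "user", "content": [{"type": "text", "text": "Page 1:"}]},
]

for message in move_system_images_to_user(messages):
    print(message["role"], [part["type"] for part in message["content"]])
```

This produces two consecutive user messages, which some endpoints may not accept, and the model may read the example differently once it leaves the system prompt, so treat it as a starting point to test against the specific deployment.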