So you're having issues with your workflow. ha-gpt4vision exposes a service you could use, but its input needs to be an image. Your goal is to get the response from ha-gpt4vision back into your conversation.
So I'd recommend writing an Extended OpenAI Conversation script that covers the complete workflow. Note: all response variables are dicts!
Here's a template I generated with ChatGPT (unmodified), just so you have an idea of how to start:
- spec:
    name: capture_and_analyze_photo
    description: >
      Captures a photo, sends the URL to the ha-gpt4vision service, and retrieves the description of the photo.
    parameters:
      type: object
      properties:
        camera_entity_id:
          type: string
          description: The entity_id of the camera to capture the photo from.
      required:
        - camera_entity_id
  function:
    type: composite
    sequence:
      - type: script
        sequence:
          - service: camera.snapshot
            target:
              entity_id: "{{ camera_entity_id }}"
            data:
              filename: "/config/www/tmp/photo.jpg"
          - service: homeassistant.update_entity
            target:
              entity_id: "{{ camera_entity_id }}"
        response_variable: photo_url
      - type: template
        value_template: >
          {{ "http://your-home-assistant-url:8123/local/tmp/photo.jpg" }}
        response_variable: photo_url
      - type: script
        sequence:
          - service: ha-gpt4vision.analyze_photo
            data:
              photo_url: "{{ photo_url }}"
        response_variable: photo_description
      - type: template
        value_template: >
          {{ photo_description }}
        response_variable: final_description
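Since response variables are dicts, the final template step would normally pull a value out of photo_description by key rather than returning the whole dict. The response_text key below is an assumption; inspect the dict the service actually returns and adjust:

- type: template
  # response_text is an assumed key; check the service's real response dict
  value_template: >
    {{ photo_description.response_text }}
  response_variable: final_description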
Uploading a picture to a vision-compliant OpenAI model was added to Extended OpenAI Conversation several months ago, @jleinenbach @The-Erf.
https://github.com/jekalmin/extended_openai_conversation/issues/43
You can use a sentence trigger through the HA GUI with keywords to choose what kind of image analysis you want, or simply ask in your native language and let ChatGPT "understand" what you say.
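For example, a sentence-trigger automation that runs an analysis and feeds the result back to the conversation could look roughly like this. It's a minimal sketch: the trigger phrase, image path, and the response_text key are assumptions, so check the gpt4vision documentation for the actual response shape.

automation:
  - alias: "Describe the garden on request"
    trigger:
      - platform: conversation
        command:
          - "what is happening in the garden"
    action:
      - service: gpt4vision.image_analyzer
        data:
          message: "Describe what you see in this image"
          image_file: /media/garden_snapshot.jpg  # placeholder path
          provider: OpenAI
        response_variable: analysis
      # set_conversation_response hands the answer back to the chat/voice pipeline;
      # response_text is an assumed key in the service's return dict
      - set_conversation_response: "{{ analysis.response_text }}"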
Thanks, can you explain more?
The developer outlines it here: https://github.com/jekalmin/extended_openai_conversation/pull/60
This spec, taken from this post, lets you chat with Extended OpenAI Conversation about an image (or multiple images if you want):
- spec:
    name: vision
    description: Analyze images
    parameters:
      type: object
      properties:
        request:
          type: string
          description: Analyze the images as requested by the user
      required:
        - request
  function:
    type: script
    sequence:
      - service: gpt4vision.image_analyzer
        data:
          max_tokens: 400
          message: "{{request}}"
          image_file: |-
            /media/Allarme_Camera.jpg
            /media/Allarme_Sala1.jpg
            /media/Snapshot_Giardino1_20240425-090813.jpg
          provider: OpenAI
          model: gpt-4-vision-preview
          target_width: 1280
          temperature: 0.5
        response_variable: _function_result
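If you'd rather analyze a fresh image than stored files, a variant of the script sequence could take a snapshot first. This is just a sketch; the camera entity and file path are placeholders:

- service: camera.snapshot
  target:
    entity_id: camera.giardino1  # placeholder camera entity
  data:
    filename: /media/latest_snapshot.jpg
- service: gpt4vision.image_analyzer
  data:
    max_tokens: 400
    message: "{{request}}"
    image_file: /media/latest_snapshot.jpg
    provider: OpenAI
    model: gpt-4-vision-preview
    target_width: 1280
    temperature: 0.5
  response_variable: _function_result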
@valentinfrlch Super beginner question here, but where do I add the spec for this to work, please? I'm just getting started with Home Assistant, so I'm still figuring things out. Grateful for any guidance anyone can give on this!
So I assume you have installed Extended OpenAI Conversation.
- Go to Settings > Devices & services > Extended OpenAI Conversation
- You should see your entry here; if not, you need to add the integration first.
- Click the configure button. There should be a "Functions" text field in the dialog. This is where the specs go.
The spec posted here has also been updated. You can find the updated version in the gpt4vision wiki.
Also note that you need to install gpt4vision (a separate integration) for this spec to work. You can do so through HACS; just follow the instructions here.
Thanks very much. I had an existing spec, so I wasn't sure whether to replace or append, but it seems like append is the way to go. Thank you @valentinfrlch!
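For anyone else wondering about replace vs. append: the Functions field holds a YAML list, so appending just means adding another `- spec:`/`function:` item to the same list. A sketch of what the field might hold with two functions (the second spec is a placeholder standing in for whatever you already had):

- spec:
    name: vision
    description: Analyze images
    parameters:
      type: object
      properties:
        request:
          type: string
          description: Analyze the images as requested by the user
      required:
        - request
  function:
    type: script
    sequence:
      - service: gpt4vision.image_analyzer
        data:
          message: "{{request}}"
          image_file: /media/Allarme_Camera.jpg
          provider: OpenAI
        response_variable: _function_result
- spec:
    name: turn_on_light  # placeholder: your pre-existing spec goes here unchanged
    description: Turn on a light
    parameters:
      type: object
      properties:
        entity_id:
          type: string
          description: The light to turn on
      required:
        - entity_id
  function:
    type: script
    sequence:
      - service: light.turn_on
        target:
          entity_id: "{{ entity_id }}"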