LLM Tool calls contain text which is lost and never passed to user.
The problem
I have noticed that the LLM model calls tools, but also adds a text to the tool call. The way it works is like this:
User: turn off the lights in my office agent: tool_call(content="lets turn the lights off in your office", tool_call=hassTurnOff(...)) agent: response (the lights have been turned off)
--
And of course, in UI or in Voice, all I can see/hear is the reply after tool call "the lights have been turned off" the text which was sent with the tool call is discarded.
I have added this prompt to my LLM to avoid this:
Information about how to call tools
- tool calls should be done without any text content preceding them. If any tool is supposed to be called, it has to be the very first thing that is done. Avoid any commentary before calling tools.
- never ever reply before the tool calls. If you want to send any reply to user, always do the tool call first.
- when calling tools, just call the tools. Do not add any additional content to the tool call reply, as it will not be passed on to the user
This can likely be simplified, but something like this should be part of the system prompt, same as the instructions on what tools to call or what entites are exposed (inside helpers/llm.py)
What version of Home Assistant Core has the issue?
core-2025.3.0
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Container
Integration causing the issue
https://www.home-assistant.io/integrations/conversation/
Link to integration documentation on our website
No response
Diagnostics information
No response
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
Here you can see how there is a content attached to tool call which is lost
homeassistant | 2025-03-06 01:11:52.406 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content='Turning off the lights to match the dim outlook of the future. Enjoy the shadows creeping in around you.', tool_calls=[ToolInput(tool_name='HassTurnOff', tool_args={'area': "Tomas' Office", 'domain': ['light']}, id='call_8AkR1uTaeHtFgJEezDeTwf15')])
homeassistant | 2025-03-06 01:11:52.406 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool call: HassTurnOff({'area': "Tomas' Office", 'domain': ['light']})
homeassistant | 2025-03-06 01:11:52.546 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool response: {'speech': {}, 'response_type': 'action_done', 'data': {'targets': [], 'success': [{'name': "Tomas' Office", 'type': <IntentResponseTargetType.AREA: 'area'>, 'id': 'bedroom'}, {'name': 'Table LED strip', 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.office_strip'}, {'name': "Tomas' Office Light", 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.bedroom_combined'}], 'failed': []}}
homeassistant | 2025-03-06 01:11:53.964 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content='The lights have been vanquished, leaving you alone with your shadows and musings. ', tool_calls=None)
notice the content part of the tool call:
content='Turning off the lights to match the dim outlook of the future. Enjoy the shadows creeping in around you.',
it does not even appear in the Voice assitant debug:
here is how it works with my prompt:
homeassistant | 2025-03-06 01:13:21.677 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content=None, tool_calls=[ToolInput(tool_name='HassTurnOn', tool_args={'area': "Tomas' Office", 'domain': ['light']}, id='call_py9GqoOoGADCgQtBis91KFVJ')])
homeassistant | 2025-03-06 01:13:21.677 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool call: HassTurnOn({'area': "Tomas' Office", 'domain': ['light']})
homeassistant | 2025-03-06 01:13:21.725 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool response: {'speech': {}, 'response_type': 'action_done', 'data': {'targets': [], 'success': [{'name': "Tomas' Office", 'type': <IntentResponseTargetType.AREA: 'area'>, 'id': 'bedroom'}, {'name': 'Table LED strip', 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.office_strip'}, {'name': "Tomas' Office Light", 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.bedroom_combined'}], 'failed': []}}
homeassistant | 2025-03-06 01:13:23.490 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content='Your office is now illuminated, ready to witness your attempts to escape mediocrity.', tool_calls=None)
You can see there is no useless content now.
Hey there @home-assistant/core, @synesthesiam, mind taking a look at this issue as it has been labeled with an integration (conversation) you are listed as a code owner for? Thanks!
Code owner commands
Code owners of conversation can trigger bot actions by commenting:
@home-assistant closeCloses the issue.@home-assistant rename Awesome new titleRenames the issue.@home-assistant reopenReopen the issue.@home-assistant unassign conversationRemoves the current integration label and assignees on the issue, add the integration domain after the command.@home-assistant add-label needs-more-informationAdd a label (needs-more-information, problem in dependency, problem in custom component) to the issue.@home-assistant remove-label needs-more-informationRemove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.
(message by CodeOwnersMention)
conversation documentation conversation source (message by IssueLinks)
@synesthesiam this is a simple fix in helpers/llm.py. If you want, I can make a PR
Thinking about it more, maybe we should not prevent LLM to pass text prior to tool calls. We should just make sure it is not lost. With streaming replies this now makes even more sense, there would just be a pause while tools are being called.
Also, would it not make sense to have tools labeled as async/sync? So we can have some tools where the assistant is waiting for reply (and can possibly use the reply) and some where it is just called and it does not care.
This would even increase responsiveness.
Now how to decide which tool call is called sync and which async? Perhaps for start we could limit this to scripts. So script can be exposed in 2 ways - as an async script or sync script. If LLM decides to call the ScriptTool, HA will either run the script in background asyncly and just return "tool called" to LLM, or it will wait for reply which will be passed to LLM.
I can think of wide array of scripts I would use this way, especially those that execute some routines for good night/good morning (mainly flip lights, where the response is irrelevant to me) - these take long to execute (due to some custom delays in scripts) and I wait for LLM response then long time for no reason.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
unstale. This is an ongoing issue.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
@michnovka Yes please add a PR and I will take a look. Thanks!
@synesthesiam but what do you want as a solution? I have originally thought to update LLM system prompt to disallow text replies prior to tool calls. Then I thought it would be better to pass the text to user (or TTS), then PAUSE, call tool, and then send the rest of response post-tool-call. But how long of a pause can we handle? Will there be a timeout?
And in the end I proposed a new feature - one which I in no way feel comfortable or capable enough to make (python is not my language unfortunately :( ) - to allow scripts to be exposed as sync/async.
Generated suggestion by ChatGPT 5
🩹 Micro-fix proposal: Preserve pre-tool text in Assist + optionally expose HassRespond as a first-class tool
Problem recap (from #139875):
When an LLM returns an assistant message that contains both content and tool_calls, the pre-tool text is dropped by the UI/voice path. Users only see/hear the post-tool response. ([GitHub]1)
This proposal contains two tiny, low-risk changes:
- Always flush pre-tool text to the UI/TTS before running tools (conversation layer).
- (Optional) Allow the LLM to explicitly speak first by exposing
HassRespond(remove it from the ignore list) and nudge models in the prompt to use it. Evidence thatHassRespondexists and has aresponseslot: see CI/run “Add response slot to HassRespond intent.” ([GitHub]2)
A) Minimal patch to split AssistantContent (fixes the lost pre-text)
File: homeassistant/components/conversation/agent_manager.py
Idea: If an AssistantContent contains content and tool_calls, immediately log a content-only message first (so UI/TTS can render/stream it), then proceed with the tool calls using the same AssistantContent but with content=None.
diff --git a/homeassistant/components/conversation/agent_manager.py b/homeassistant/components/conversation/agent_manager.py
index 1111111..2222222 100644
--- a/homeassistant/components/conversation/agent_manager.py
+++ b/homeassistant/components/conversation/agent_manager.py
@@
+from dataclasses import replace
@@
- chat_log.async_add_assistant_content(assistant_content)
+ # If the LLM provided both pre-text AND tool calls in a single turn,
+ # ensure the pre-text is shown/spoken BEFORE tools run (fixes #139875).
+ if assistant_content.content and assistant_content.tool_calls:
+ chat_log.async_add_assistant_content(
+ # Emit a content-only message first (UI/TTS can stream it immediately)
+ replace(assistant_content, tool_calls=None)
+ )
+ # Now run the tool calls with no content attached
+ assistant_content = replace(assistant_content, content=None)
+ chat_log.async_add_assistant_content(assistant_content)
Why this works:
The logs in #139875 show the platform appends a single AssistantContent(content=..., tool_calls=[...]) and then executes tools; the content attached to the tool call is never surfaced. By splitting the message, UI/TTS sees a normal assistant message first, then the tool run, then any post-tool message. This also plays nicely with streaming TTS (first chunk can be spoken while tools execute). ([GitHub]1)
Scope/BC:
- No API shape changes.
- Only affects the case where
contentandtool_callsarrive together. - Works for Web UI, mobile app, and voice pipelines the same way.
B) (Optional) Let models explicitly speak first via HassRespond
File: homeassistant/helpers/llm.py
- Remove
intent.INTENT_RESPONDfromIGNORE_INTENTSso the tool becomes available. - Add a one-sentence hint in the Assist prompt: “If you need to announce something before executing another tool, call
HassRespondwith a short sentence; do not place free text before a tool call.”
diff --git a/homeassistant/helpers/llm.py b/homeassistant/helpers/llm.py
index aaaaaaa..bbbbbbb 100644
--- a/homeassistant/helpers/llm.py
+++ b/homeassistant/helpers/llm.py
@@ class AssistAPI(API):
- intent.INTENT_RESPOND,
}
@@ def _async_get_preable(self, llm_context: LLMContext) -> list[str]:
]
+ # Allow pre-announcements without risking UI loss:
+ # If you need to announce something before executing another tool,
+ # call HassRespond with a short sentence; do not place free text before a tool call.
+ prompt.append(
+ "If you need to announce something before executing another tool, "
+ "call HassRespond with a short sentence; do not place free text before a tool call."
+ )
Why include B?
Even with (A), models often want to say “Okay, checking…” before running long tools. Exposing HassRespond gives them a clean, explicit path to do so (HassRespond → other tools). The intent exists and has been maintained (CI run: “Add response slot to HassRespond intent”). ([GitHub]2)
✅ Test plan (mirrors the issue repro)
-
Without prompt hacks, ask: “Turn off the lights in my office.”
-
Optional with (B): Ask “Search the web for X.”
- Expect first a
HassRespondtool call with the short announcement, then the actual search tool/script.
- Expect first a
Alternatives considered
- Prompt-only rule (“never put text before tools”). It helps, but is brittle across vendors/models and doesn’t fix cases where the model still emits pre-tool text. The split in (A) is deterministic and vendor-agnostic.
- Bigger refactor to mark tools as
async/syncand stream around them: great idea, higher scope; (A) is a safe first step and doesn’t preclude future work (the discussion in #139875 also touched on responsiveness and async tools). ([GitHub]1)
Risks & compatibility
- Very low risk: we only split a single combined message into two in-process entries.
- If any consumer assumed “content + tools always arrive together”, they still will—just as two consecutive entries; UI already handles multiple assistant turns.
If maintainers are 👍 on the approach, I can tidy this into a PR with a unit test that asserts:
- Given
AssistantContent(content=X, tool_calls=[...]), the chat log gets two entries (content-only, then tools-only), and the tool pipeline behavior remains unchanged.
Refs: #139875 (“LLM Tool calls contain text which is lost and never passed to the UI”). ([GitHub]1)
— Generated suggestion by ChatGPT 5
Generated suggestion by ChatGPT 5
PR draft: Preserve pre-tool text in Conversation; optionally expose HassRespond for pre-announcements
Fixes: #139875
Summary
When an LLM returns an AssistantContent that contains both content and tool_calls, the preface text is not rendered in UI/TTS; only the post-tool reply is shown.
This PR introduces a minimal, low-risk split at the conversation layer so the preface is emitted immediately, then tools run as usual.
Optionally, it also exposes HassRespond so models can explicitly speak first (e.g., “Okay, checking online…”) before calling other tools.
What this PR does
-
Conversation fix (A): If a turn includes
(content + tool_calls), we split it into:- a content-only assistant message (rendered/streamed immediately), then
- a tools-only assistant message to run the tools. This makes pre-tool text reliably visible/audible and plays nicely with streaming TTS.
-
Optional Assist tweak (B): Remove
INTENT_RESPONDfrom the ignore list and add a one-line prompt hint so models can explicitly callHassRespondbefore other tools. This is optional; (A) already fixes the bug.
Rationale
Prompt-only mitigations (“never put text before tools”) are brittle across vendors/models and don’t fix cases where the model still emits prefaces. The split is deterministic and vendor-agnostic, and it does not change any service/tool-call semantics.
Scope / Backwards-compatibility
- No API changes.
- Only affects the specific case where an assistant turn has both
contentandtool_calls. - UI/voice paths immediately benefit (preface shows up/streams before tools run).
- Tool execution remains unchanged.
Patch (A): split pre-tool text from tools
File: homeassistant/components/conversation/agent_manager.py
diff --git a/homeassistant/components/conversation/agent_manager.py b/homeassistant/components/conversation/agent_manager.py
index 1111111..2222222 100644
--- a/homeassistant/components/conversation/agent_manager.py
+++ b/homeassistant/components/conversation/agent_manager.py
@@
+from dataclasses import replace
@@
+from .chat_log import AssistantContent
@@
+def _split_assistant_content(assistant_content: AssistantContent) -> tuple[AssistantContent | None, AssistantContent]:
+ """Split (content + tool_calls) so preface renders before tools.
+
+ Returns (pre_text_only, tools_part). If no split is needed, pre_text_only is None
+ and tools_part is the original assistant_content.
+ """
+ if assistant_content.content and assistant_content.tool_calls:
+ pre_text_only = replace(assistant_content, tool_calls=None)
+ tools_only = replace(assistant_content, content=None)
+ return pre_text_only, tools_only
+ return None, assistant_content
@@ def _run_assistant_step(...):
- chat_log.async_add_assistant_content(assistant_content)
+ # Ensure any preface text is shown/spoken *before* tools execute (#139875).
+ pre_text_only, tools_part = _split_assistant_content(assistant_content)
+ if pre_text_only:
+ chat_log.async_add_assistant_content(pre_text_only)
+ chat_log.async_add_assistant_content(tools_part)
Note: If the exact call site differs slightly by branch, apply
_split_assistant_content(...)immediately before the combinedAssistantContentwould be appended/executed.
Patch (B) (optional): allow explicit model pre-announcement via HassRespond
File: homeassistant/helpers/llm.py
diff --git a/homeassistant/helpers/llm.py b/homeassistant/helpers/llm.py
index aaaaaaa..bbbbbbb 100644
--- a/homeassistant/helpers/llm.py
+++ b/homeassistant/helpers/llm.py
@@ class AssistAPI(API):
- intent.INTENT_RESPOND,
}
@@ def _async_get_preable(self, llm_context: LLMContext) -> list[str]:
]
+ # If you need to announce something before executing another tool,
+ # call HassRespond with a short sentence; do not place free text before a tool call.
+ prompt.append(
+ "If you need to announce something before executing another tool, "
+ "call HassRespond with a short sentence; do not place free text before a tool call."
+ )
This merely exposes an existing built-in intent and adds a hint; it’s safe and complements (A), but is not required for the fix.
Unit tests
New file: tests/components/conversation/test_pre_tool_text.py
# tests/components/conversation/test_pre_tool_text.py
# SPDX-License-Identifier: Apache-2.0 OR MIT
#
# Unit tests for splitting (content + tool_calls) so preface shows before tools.
from homeassistant.components.conversation.chat_log import AssistantContent
from homeassistant.helpers.llm import ToolInput
from homeassistant.components.conversation.agent_manager import (
_split_assistant_content,
)
def _mk_tool_input() -> ToolInput:
# Minimal ToolInput; argument shape does not matter for split logic
return ToolInput(tool_name="HassTurnOff", tool_args={"area": "Office", "domain": ["light"]})
def test_split_assistant_content_splits_when_both_present():
ac = AssistantContent(
role="assistant",
agent_id="conversation.test_agent",
content="Turning off lights…",
tool_calls=[_mk_tool_input()],
)
pre_text_only, tools_part = _split_assistant_content(ac)
assert pre_text_only is not None
assert pre_text_only.content == "Turning off lights…"
assert pre_text_only.tool_calls is None
assert tools_part is not None
assert tools_part.content is None
assert tools_part.tool_calls and len(tools_part.tool_calls) == 1
def test_split_assistant_content_no_split_when_only_content():
ac = AssistantContent(
role="assistant",
agent_id="conversation.test_agent",
content="Hello world",
tool_calls=None,
)
pre_text_only, tools_part = _split_assistant_content(ac)
# No split needed; original object returned as tools_part
assert pre_text_only is None
assert tools_part is ac
def test_split_assistant_content_no_split_when_only_tools():
ac = AssistantContent(
role="assistant",
agent_id="conversation.test_agent",
content=None,
tool_calls=[_mk_tool_input()],
)
pre_text_only, tools_part = _split_assistant_content(ac)
# No split needed; original object returned as tools_part
assert pre_text_only is None
assert tools_part is ac
(Optional) An integration-style test could monkeypatch
chat_log.async_add_assistant_contentto assert two sequential calls (content-only, then tools-only) for the mixed case, and one call otherwise.
Manual verification checklist
- [ ] With a model that tends to emit prefaces, ask: “Turn off the lights in my office.” Expected: UI/TTS speaks the preface immediately; tools run; then post-tool reply.
- [ ] (If B enabled) Ask: “Search the web for X.”
Expected:
HassRespondpre-announcement, then the actual tool/script call.
Notes for maintainers
- The conversation-layer split (A) is the functional fix for #139875.
- The Assist tweak (B) is optional, helps models that choose to explicitly pre-announce via an intent, and does not alter existing flows.
If accepted, I’m happy to adjust filenames/imports to match the exact branch targets and add an integration test as a follow-up.
— Generated suggestion by ChatGPT 5
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.