
LLM Tool calls contain text which is lost and never passed to the user.

Open · michnovka opened this issue 10 months ago · 11 comments

The problem

I have noticed that the LLM calls tools but also attaches text to the tool call. It works like this:

User: turn off the lights in my office
Agent: tool_call(content="lets turn the lights off in your office", tool_call=HassTurnOff(...))
Agent: response ("the lights have been turned off")

--

And of course, in the UI or in Voice, all I can see/hear is the reply after the tool call ("the lights have been turned off"); the text that was sent with the tool call is discarded.
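For reference, in the OpenAI-style function-calling format such a turn looks roughly like the sketch below (field names follow the OpenAI chat completions API; the ID and arguments here are made up, and the exact payload depends on the conversation integration in use):

# Rough shape of the problematic assistant turn (OpenAI-style function
# calling; illustrative only):
assistant_turn = {
    "role": "assistant",
    # This is the text that currently gets discarded by the UI/voice path:
    "content": "lets turn the lights off in your office",
    "tool_calls": [
        {
            "id": "call_abc123",  # made-up ID
            "type": "function",
            "function": {
                "name": "HassTurnOff",
                "arguments": '{"area": "office", "domain": ["light"]}',
            },
        }
    ],
}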

I have added this prompt to my LLM to avoid this:

Information about how to call tools

  • Tool calls should be done without any text content preceding them. If any tool is supposed to be called, it has to be the very first thing that is done. Avoid any commentary before calling tools.
  • Never ever reply before the tool calls. If you want to send any reply to the user, always do the tool call first.
  • When calling tools, just call the tools. Do not add any additional content to the tool call reply, as it will not be passed on to the user.

This can likely be simplified, but something like this should be part of the system prompt, just like the instructions on which tools to call and which entities are exposed (inside helpers/llm.py).
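A minimal sketch of such prompt lines (the constant name is hypothetical, and where exactly they would be appended inside helpers/llm.py is left open):

# Hypothetical lines to append to the Assist system prompt built in
# homeassistant/helpers/llm.py; TOOL_CALL_PROMPT_LINES is a made-up name.
TOOL_CALL_PROMPT_LINES = [
    "Call tools without any preceding text content; if a tool should be "
    "called, make the tool call the very first thing you do.",
    "Never reply before the tool calls; reply to the user only after "
    "the tool call.",
    "Do not add any additional content to a tool call, as it will not "
    "be passed on to the user.",
]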

What version of Home Assistant Core has the issue?

core-2025.3.0

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

https://www.home-assistant.io/integrations/conversation/

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet


Anything in the logs that might be useful for us?


Additional information

Here you can see how there is content attached to the tool call which is lost:

homeassistant      | 2025-03-06 01:11:52.406 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content='Turning off the lights to match the dim outlook of the future. Enjoy the shadows creeping in around you.', tool_calls=[ToolInput(tool_name='HassTurnOff', tool_args={'area': "Tomas' Office", 'domain': ['light']}, id='call_8AkR1uTaeHtFgJEezDeTwf15')])
homeassistant      | 2025-03-06 01:11:52.406 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool call: HassTurnOff({'area': "Tomas' Office", 'domain': ['light']})
homeassistant      | 2025-03-06 01:11:52.546 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool response: {'speech': {}, 'response_type': 'action_done', 'data': {'targets': [], 'success': [{'name': "Tomas' Office", 'type': <IntentResponseTargetType.AREA: 'area'>, 'id': 'bedroom'}, {'name': 'Table LED strip', 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.office_strip'}, {'name': "Tomas' Office Light", 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.bedroom_combined'}], 'failed': []}}
homeassistant      | 2025-03-06 01:11:53.964 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content='The lights have been vanquished, leaving you alone with your shadows and musings. ', tool_calls=None)

notice the content part of the tool call:

content='Turning off the lights to match the dim outlook of the future. Enjoy the shadows creeping in around you.',

It does not even appear in the Voice assistant debug:

[Screenshot: Voice assistant debug]


Here is how it works with my prompt:

homeassistant      | 2025-03-06 01:13:21.677 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content=None, tool_calls=[ToolInput(tool_name='HassTurnOn', tool_args={'area': "Tomas' Office", 'domain': ['light']}, id='call_py9GqoOoGADCgQtBis91KFVJ')])
homeassistant      | 2025-03-06 01:13:21.677 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool call: HassTurnOn({'area': "Tomas' Office", 'domain': ['light']})
homeassistant      | 2025-03-06 01:13:21.725 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Tool response: {'speech': {}, 'response_type': 'action_done', 'data': {'targets': [], 'success': [{'name': "Tomas' Office", 'type': <IntentResponseTargetType.AREA: 'area'>, 'id': 'bedroom'}, {'name': 'Table LED strip', 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.office_strip'}, {'name': "Tomas' Office Light", 'type': <IntentResponseTargetType.ENTITY: 'entity'>, 'id': 'light.bedroom_combined'}], 'failed': []}}
homeassistant      | 2025-03-06 01:13:23.490 DEBUG (MainThread) [homeassistant.components.conversation.chat_log] Adding assistant content: AssistantContent(role='assistant', agent_id='conversation.chatgpt_glados', content='Your office is now illuminated, ready to witness your attempts to escape mediocrity.', tool_calls=None)

You can see there is no useless content now.

[Screenshot: Voice assistant debug]

michnovka · Mar 06 '25 00:03

Hey there @home-assistant/core, @synesthesiam, mind taking a look at this issue as it has been labeled with an integration (conversation) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of conversation can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign conversation Removes the current integration label and assignees on the issue, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


conversation documentation · conversation source (message by IssueLinks)

home-assistant[bot] · Mar 06 '25 00:03

@synesthesiam this is a simple fix in helpers/llm.py. If you want, I can make a PR

michnovka · Mar 06 '25 00:03

Thinking about it more, maybe we should not prevent the LLM from passing text prior to tool calls. We should just make sure it is not lost. With streaming replies this makes even more sense: there would just be a pause while tools are being called.

Also, would it not make sense to have tools labeled as async/sync? That way we could have some tools where the assistant waits for the reply (and can possibly use it) and some that are simply fired off without waiting for the result.

This would even increase responsiveness.

Now, how do we decide which tool call is sync and which async? Perhaps for a start we could limit this to scripts, so a script can be exposed in two ways: as an async script or as a sync script. If the LLM decides to call the ScriptTool, HA will either run the script in the background asynchronously and just return "tool called" to the LLM, or it will wait for the reply, which will then be passed to the LLM.
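A rough sketch of what that could look like, assuming a hypothetical ScriptTool wrapper with a per-script async_mode flag (the script.turn_on service call is real; the class and flag are illustrative):

from homeassistant.core import HomeAssistant


class ScriptTool:  # hypothetical wrapper, not an existing HA class
    """Expose a script to the LLM either synchronously or asynchronously."""

    def __init__(self, entity_id: str, async_mode: bool) -> None:
        self.entity_id = entity_id
        self.async_mode = async_mode  # hypothetical per-script setting

    async def async_call(self, hass: HomeAssistant) -> dict:
        if self.async_mode:
            # Fire and forget: start the script in the background and
            # return to the LLM immediately, keeping the conversation snappy.
            hass.async_create_task(
                hass.services.async_call(
                    "script", "turn_on", {"entity_id": self.entity_id}
                )
            )
            return {"result": "script started in background"}
        # Synchronous: block until the script finishes so its outcome
        # can be passed back to the LLM.
        await hass.services.async_call(
            "script", "turn_on", {"entity_id": self.entity_id}, blocking=True
        )
        return {"result": "script finished"}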

I can think of a wide array of scripts I would use this way, especially those that execute good-night/good-morning routines (mainly flipping lights, where the response is irrelevant to me). These take a long time to execute (due to custom delays in the scripts), and I end up waiting a long time for the LLM response for no reason.

michnovka · Mar 08 '25 10:03

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

unstale. This is an ongoing issue.

michnovka · Jun 06 '25 13:06

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@michnovka Yes please add a PR and I will take a look. Thanks!

synesthesiam · Sep 04 '25 21:09

@synesthesiam but what do you want as a solution? I originally thought to update the LLM system prompt to disallow text replies prior to tool calls. Then I thought it would be better to pass the text to the user (or TTS), then pause, call the tool, and then send the rest of the response post-tool-call. But how long of a pause can we handle? Will there be a timeout?

And in the end I proposed a new feature, one I in no way feel comfortable or capable enough to implement myself (Python is not my language, unfortunately :( ): allowing scripts to be exposed as sync/async.

michnovka · Sep 08 '25 09:09

Generated suggestion by ChatGPT 5


🩹 Micro-fix proposal: Preserve pre-tool text in Assist + optionally expose HassRespond as a first-class tool

Problem recap (from #139875): When an LLM returns an assistant message that contains both content and tool_calls, the pre-tool text is dropped by the UI/voice path. Users only see/hear the post-tool response.

This proposal contains two tiny, low-risk changes:

  1. Always flush pre-tool text to the UI/TTS before running tools (conversation layer).
  2. (Optional) Allow the LLM to explicitly speak first by exposing HassRespond (remove it from the ignore list) and nudge models in the prompt to use it. Evidence that HassRespond exists and has a response slot: see the CI run “Add response slot to HassRespond intent.”

A) Minimal patch to split AssistantContent (fixes the lost pre-text)

File: homeassistant/components/conversation/agent_manager.py

Idea: If an AssistantContent contains content and tool_calls, immediately log a content-only message first (so UI/TTS can render/stream it), then proceed with the tool calls using the same AssistantContent but with content=None.

diff --git a/homeassistant/components/conversation/agent_manager.py b/homeassistant/components/conversation/agent_manager.py
index 1111111..2222222 100644
--- a/homeassistant/components/conversation/agent_manager.py
+++ b/homeassistant/components/conversation/agent_manager.py
@@
+from dataclasses import replace
@@
-    chat_log.async_add_assistant_content(assistant_content)
+    # If the LLM provided both pre-text AND tool calls in a single turn,
+    # ensure the pre-text is shown/spoken BEFORE tools run (fixes #139875).
+    if assistant_content.content and assistant_content.tool_calls:
+        chat_log.async_add_assistant_content(
+            # Emit a content-only message first (UI/TTS can stream it immediately)
+            replace(assistant_content, tool_calls=None)
+        )
+        # Now run the tool calls with no content attached
+        assistant_content = replace(assistant_content, content=None)
+    chat_log.async_add_assistant_content(assistant_content)

Why this works: The logs in #139875 show the platform appends a single AssistantContent(content=..., tool_calls=[...]) and then executes tools; the content attached to the tool call is never surfaced. By splitting the message, UI/TTS sees a normal assistant message first, then the tool run, then any post-tool message. This also plays nicely with streaming TTS (first chunk can be spoken while tools execute).

Scope/BC:

  • No API shape changes.
  • Only affects the case where content and tool_calls arrive together.
  • Works for Web UI, mobile app, and voice pipelines the same way.

B) (Optional) Let models explicitly speak first via HassRespond

File: homeassistant/helpers/llm.py

  1. Remove intent.INTENT_RESPOND from IGNORE_INTENTS so the tool becomes available.
  2. Add a one-sentence hint in the Assist prompt: “If you need to announce something before executing another tool, call HassRespond with a short sentence; do not place free text before a tool call.”
diff --git a/homeassistant/helpers/llm.py b/homeassistant/helpers/llm.py
index aaaaaaa..bbbbbbb 100644
--- a/homeassistant/helpers/llm.py
+++ b/homeassistant/helpers/llm.py
@@ class AssistAPI(API):
-        intent.INTENT_RESPOND,
     }
@@ def _async_get_preable(self, llm_context: LLMContext) -> list[str]:
         ]
+        # Allow pre-announcements without risking UI loss:
+        # If you need to announce something before executing another tool,
+        # call HassRespond with a short sentence; do not place free text before a tool call.
+        prompt.append(
+            "If you need to announce something before executing another tool, "
+            "call HassRespond with a short sentence; do not place free text before a tool call."
+        )

Why include B? Even with (A), models often want to say “Okay, checking…” before running long tools. Exposing HassRespond gives them a clean, explicit path to do so (HassRespond → other tools). The intent exists and has been maintained (CI run: “Add response slot to HassRespond intent”).


✅ Test plan (mirrors the issue repro)

  1. Without prompt hacks, ask: “Turn off the lights in my office.”

    • Before: pre-tool quip was hidden.
    • After (A): You see/hear the pre-tool line immediately, then the tool executes, then the post-tool confirmation. Logs will show two assistant entries: one content-only, then one with tool_calls.
  2. Optional with (B): Ask “Search the web for X.”

    • Expect first a HassRespond tool call with the short announcement, then the actual search tool/script.

Alternatives considered

  • Prompt-only rule (“never put text before tools”). It helps, but is brittle across vendors/models and doesn’t fix cases where the model still emits pre-tool text. The split in (A) is deterministic and vendor-agnostic.
  • Bigger refactor to mark tools as async/sync and stream around them: great idea, higher scope; (A) is a safe first step and doesn’t preclude future work (the discussion in #139875 also touched on responsiveness and async tools).

Risks & compatibility

  • Very low risk: we only split a single combined message into two in-process entries.
  • If any consumer assumed “content + tools always arrive together”, they still will—just as two consecutive entries; UI already handles multiple assistant turns.

If maintainers are 👍 on the approach, I can tidy this into a PR with a unit test that asserts:

  • Given AssistantContent(content=X, tool_calls=[...]), the chat log gets two entries (content-only, then tools-only), and the tool pipeline behavior remains unchanged.

Refs: #139875 (“LLM Tool calls contain text which is lost and never passed to the UI”).

Generated suggestion by ChatGPT 5

jleinenbach · Sep 23 '25 10:09

Generated suggestion by ChatGPT 5


PR draft: Preserve pre-tool text in Conversation; optionally expose HassRespond for pre-announcements

Fixes: #139875

Summary

When an LLM returns an AssistantContent that contains both content and tool_calls, the preface text is not rendered in UI/TTS; only the post-tool reply is shown. This PR introduces a minimal, low-risk split at the conversation layer so the preface is emitted immediately, then tools run as usual. Optionally, it also exposes HassRespond so models can explicitly speak first (e.g., “Okay, checking online…”) before calling other tools.

What this PR does

  • Conversation fix (A): If a turn includes (content + tool_calls), we split it into:

    1. a content-only assistant message (rendered/streamed immediately), then
    2. a tools-only assistant message to run the tools. This makes pre-tool text reliably visible/audible and plays nicely with streaming TTS.
  • Optional Assist tweak (B): Remove INTENT_RESPOND from the ignore list and add a one-line prompt hint so models can explicitly call HassRespond before other tools. This is optional; (A) already fixes the bug.

Rationale

Prompt-only mitigations (“never put text before tools”) are brittle across vendors/models and don’t fix cases where the model still emits prefaces. The split is deterministic and vendor-agnostic, and it does not change any service/tool-call semantics.

Scope / Backwards-compatibility

  • No API changes.
  • Only affects the specific case where an assistant turn has both content and tool_calls.
  • UI/voice paths immediately benefit (preface shows up/streams before tools run).
  • Tool execution remains unchanged.

Patch (A): split pre-tool text from tools

File: homeassistant/components/conversation/agent_manager.py

diff --git a/homeassistant/components/conversation/agent_manager.py b/homeassistant/components/conversation/agent_manager.py
index 1111111..2222222 100644
--- a/homeassistant/components/conversation/agent_manager.py
+++ b/homeassistant/components/conversation/agent_manager.py
@@
+from dataclasses import replace
@@
+from .chat_log import AssistantContent
@@
+def _split_assistant_content(assistant_content: AssistantContent) -> tuple[AssistantContent | None, AssistantContent]:
+    """Split (content + tool_calls) so preface renders before tools.
+
+    Returns (pre_text_only, tools_part). If no split is needed, pre_text_only is None
+    and tools_part is the original assistant_content.
+    """
+    if assistant_content.content and assistant_content.tool_calls:
+        pre_text_only = replace(assistant_content, tool_calls=None)
+        tools_only = replace(assistant_content, content=None)
+        return pre_text_only, tools_only
+    return None, assistant_content
@@ def _run_assistant_step(...):
-    chat_log.async_add_assistant_content(assistant_content)
+    # Ensure any preface text is shown/spoken *before* tools execute (#139875).
+    pre_text_only, tools_part = _split_assistant_content(assistant_content)
+    if pre_text_only:
+        chat_log.async_add_assistant_content(pre_text_only)
+    chat_log.async_add_assistant_content(tools_part)

Note: If the exact call site differs slightly by branch, apply _split_assistant_content(...) immediately before the combined AssistantContent would be appended/executed.


Patch (B) (optional): allow explicit model pre-announcement via HassRespond

File: homeassistant/helpers/llm.py

diff --git a/homeassistant/helpers/llm.py b/homeassistant/helpers/llm.py
index aaaaaaa..bbbbbbb 100644
--- a/homeassistant/helpers/llm.py
+++ b/homeassistant/helpers/llm.py
@@ class AssistAPI(API):
-        intent.INTENT_RESPOND,
     }
@@ def _async_get_preable(self, llm_context: LLMContext) -> list[str]:
         ]
+        # If you need to announce something before executing another tool,
+        # call HassRespond with a short sentence; do not place free text before a tool call.
+        prompt.append(
+            "If you need to announce something before executing another tool, "
+            "call HassRespond with a short sentence; do not place free text before a tool call."
+        )

This merely exposes an existing built-in intent and adds a hint; it’s safe and complements (A), but is not required for the fix.


Unit tests

New file: tests/components/conversation/test_pre_tool_text.py

# tests/components/conversation/test_pre_tool_text.py
# SPDX-License-Identifier: Apache-2.0 OR MIT
#
# Unit tests for splitting (content + tool_calls) so preface shows before tools.

from homeassistant.components.conversation.chat_log import AssistantContent
from homeassistant.helpers.llm import ToolInput

from homeassistant.components.conversation.agent_manager import (
    _split_assistant_content,
)


def _mk_tool_input() -> ToolInput:
    # Minimal ToolInput; argument shape does not matter for split logic
    return ToolInput(tool_name="HassTurnOff", tool_args={"area": "Office", "domain": ["light"]})


def test_split_assistant_content_splits_when_both_present():
    ac = AssistantContent(
        role="assistant",
        agent_id="conversation.test_agent",
        content="Turning off lights…",
        tool_calls=[_mk_tool_input()],
    )

    pre_text_only, tools_part = _split_assistant_content(ac)

    assert pre_text_only is not None
    assert pre_text_only.content == "Turning off lights…"
    assert pre_text_only.tool_calls is None

    assert tools_part is not None
    assert tools_part.content is None
    assert tools_part.tool_calls and len(tools_part.tool_calls) == 1


def test_split_assistant_content_no_split_when_only_content():
    ac = AssistantContent(
        role="assistant",
        agent_id="conversation.test_agent",
        content="Hello world",
        tool_calls=None,
    )

    pre_text_only, tools_part = _split_assistant_content(ac)

    # No split needed; original object returned as tools_part
    assert pre_text_only is None
    assert tools_part is ac


def test_split_assistant_content_no_split_when_only_tools():
    ac = AssistantContent(
        role="assistant",
        agent_id="conversation.test_agent",
        content=None,
        tool_calls=[_mk_tool_input()],
    )

    pre_text_only, tools_part = _split_assistant_content(ac)

    # No split needed; original object returned as tools_part
    assert pre_text_only is None
    assert tools_part is ac

(Optional) An integration-style test could monkeypatch chat_log.async_add_assistant_content to assert two sequential calls (content-only, then tools-only) for the mixed case, and one call otherwise.
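A rough sketch of that, reusing the names from this draft (_split_assistant_content, AssistantContent, _mk_tool_input) with a stub chat log standing in for the real ChatLog instance a full test would monkeypatch:

class _StubChatLog:
    """Records assistant content instead of rendering it."""

    def __init__(self) -> None:
        self.entries: list[AssistantContent] = []

    def async_add_assistant_content(self, content: AssistantContent) -> None:
        self.entries.append(content)


def test_mixed_turn_yields_two_sequential_entries():
    chat_log = _StubChatLog()
    ac = AssistantContent(
        role="assistant",
        agent_id="conversation.test_agent",
        content="Turning off lights…",
        tool_calls=[_mk_tool_input()],
    )

    # Mirror the patched call site: split, then append each part in order.
    pre_text_only, tools_part = _split_assistant_content(ac)
    if pre_text_only is not None:
        chat_log.async_add_assistant_content(pre_text_only)
    chat_log.async_add_assistant_content(tools_part)

    assert len(chat_log.entries) == 2
    assert chat_log.entries[0].tool_calls is None
    assert chat_log.entries[1].content is None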


Manual verification checklist

  • [ ] With a model that tends to emit prefaces, ask: “Turn off the lights in my office.” Expected: UI/TTS speaks the preface immediately; tools run; then post-tool reply.
  • [ ] (If B enabled) Ask: “Search the web for X.” Expected: HassRespond pre-announcement, then the actual tool/script call.

Notes for maintainers

  • The conversation-layer split (A) is the functional fix for #139875.
  • The Assist tweak (B) is optional, helps models that choose to explicitly pre-announce via an intent, and does not alter existing flows.

If accepted, I’m happy to adjust filenames/imports to match the exact branch targets and add an integration test as a follow-up.

Generated suggestion by ChatGPT 5

jleinenbach · Sep 23 '25 11:09

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.