sdk-python icon indicating copy to clipboard operation
sdk-python copied to clipboard

[BUG] handle_redact_content ignores redactUserContentMessage, causing incorrect block messages for input guardrails

Open KiyotakaMatsushita opened this issue 6 days ago • 0 comments

Problem Summary

When using AWS Bedrock Guardrails with separate custom messages for input and output blocking, the handle_redact_content function in src/strands/event_loop/streaming.py only processes redactAssistantContentMessage and completely ignores redactUserContentMessage. This causes the output guardrail block message to be displayed even when the input guardrail is triggered.

Impact

Users see incorrect block messages, which negatively affects user experience and can cause confusion about why content was blocked.


Current Behavior

When an input guardrail is triggered, the response shows:

  • Actual: Output guardrail block message is displayed
  • Expected: Input guardrail block message should be displayed

Current Implementation

File: src/strands/event_loop/streaming.py (around line 385-392 in main branch)

def handle_redact_content(event: RedactContentEvent, state: dict[str, Any]) -> None:
    """Handles redacting content from the input or output.

    Args:
        event: Redact Content Event.
        state: The current state of message processing.
    """
    if event.get("redactAssistantContentMessage") is not None:
        state["message"]["content"] = [{"text": event["redactAssistantContentMessage"]}]
    # ❌ redactUserContentMessage is completely ignored

Steps to Reproduce

  1. Set up AWS Bedrock Guardrail with:

    • Input blocking enabled with custom message A
    • Output blocking enabled with custom message B
    • Different keywords for input vs output blocking
  2. Configure Strands agent with:

bedrock_model_kwargs = {
    "guardrail_redact_input": True,
    "guardrail_redact_input_message": "Input blocked message",
    "guardrail_redact_output": True,
    "guardrail_redact_output_message": "Output blocked message",
}
  1. Send a request that triggers the input guardrail

  2. Observe the response shows "Output blocked message" instead of "Input blocked message"


Expected Behavior

The function should check both redactUserContentMessage (for input blocks) and redactAssistantContentMessage (for output blocks), and apply the appropriate message.


Proposed Fix

def handle_redact_content(event: RedactContentEvent, state: dict[str, Any]) -> None:
    """Handles redacting content from the input or output.

    Args:
        event: Redact Content Event.
        state: The current state of message processing.
    """
    # Check input redaction first
    if event.get("redactUserContentMessage") is not None:
        state["message"]["content"] = [{"text": event["redactUserContentMessage"]}]
    # Check output redaction
    elif event.get("redactAssistantContentMessage") is not None:
        state["message"]["content"] = [{"text": event["redactAssistantContentMessage"]}]

Additional Context

Inconsistency within the codebase

Interestingly, src/strands/agent/agent.py correctly handles redactUserContentMessage:

# agent.py correctly processes redactUserContentMessage
if (
    isinstance(event, ModelStreamChunkEvent)
    and event.chunk
    and event.chunk.get("redactContent")
    and event.chunk["redactContent"].get("redactUserContentMessage")
):
    self.messages[-1]["content"] = self._redact_user_content(
        self.messages[-1]["content"], 
        str(event.chunk["redactContent"]["redactUserContentMessage"])
    )

This shows that:

  1. redactUserContentMessage is a valid and expected field
  2. There's an inconsistency between agent.py (correct) and streaming.py (incorrect)

AWS Bedrock Event Structure

When an input guardrail is triggered, AWS Bedrock returns:

{
  "redactContent": {
    "redactUserContentMessage": "Input blocked message",
    "redactAssistantContentMessage": "Output blocked message"
  }
}

Both fields are present in the event, but streaming.py only checks for redactAssistantContentMessage.


Environment

  • strands-agents version: Latest (checked main branch on 2025-12-13)
  • Python version: 3.13
  • AWS Bedrock: Guardrails with separate input/output custom messages
  • Affected file: src/strands/event_loop/streaming.py

Verification

I've verified that:

  • ✅ The bug exists in the current main branch (as of 2025-12-13)
  • ✅ No existing issue reports this problem (searched via GitHub CLI)
  • ✅ Related but different issues (#1075, #1077) have been closed
  • ✅ Tests exist in the codebase that reference redactUserContentMessage

Related Issues

  • #1075 - Different issue about guardrails_trace settings (closed)
  • #1077 - Different issue about tool output redaction (closed)

KiyotakaMatsushita avatar Dec 13 '25 11:12 KiyotakaMatsushita