[BUG] litellm.BadRequestError: PerplexityException – suspected consecutive user messages from Strix agent identity block
Description:
Running a Strix scan against a Perplexity model aborts with:

> litellm.BadRequestError: PerplexityException - After the (optional) system message(s), user or tool message(s) should alternate with assistant message(s).

My assumption is that Strix injects an "agent identity" block as a user message before forwarding the actual user prompt, producing two consecutive user messages. That would violate Perplexity's expected alternation between user and assistant messages, but I'm not entirely certain this is the root cause.
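For illustration, the message list I suspect Strix sends looks roughly like this (the identity text and prompt below are placeholders, not Strix's actual content):

```python
# Hypothetical sketch of the suspected request shape; the strings are
# placeholders, not the actual Strix identity block or prompt.
messages = [
    {"role": "user", "content": "You are Strix, an autonomous security agent..."},  # injected identity block
    {"role": "user", "content": "Scan ./app-directory for vulnerabilities."},       # actual user prompt
]
# Perplexity rejects this shape: two consecutive user messages after the
# (optional) system message(s) break the required user/assistant alternation.
```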
To Reproduce:
- Install Strix:
  ```bash
  pip install strix-agent
  ```
- Ensure Docker and Python 3.12+ are available.
- Export LLM settings:
  ```bash
  export STRIX_LLM="perplexity/sonar-reasoning"
  export LLM_API_KEY="ppx_your_key"
  export LLM_API_BASE="https://api.perplexity.ai"
  ```
- Run any Strix scan, e.g.:
  ```bash
  strix --target ./app-directory
  ```
- Observe the error in the CLI output: the run aborts with the Perplexity/LiteLLM alternating-message error, preventing agents from executing.
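Independent of Strix, the same class of error can be triggered by sending two consecutive user messages through LiteLLM directly; a minimal sketch, assuming LLM_API_KEY is set in the environment and the message contents are placeholders:

```python
import os

import litellm

# Two consecutive user messages, mimicking the suspected
# identity-block-plus-prompt shape.
try:
    litellm.completion(
        model="perplexity/sonar-reasoning",
        api_key=os.environ["LLM_API_KEY"],
        messages=[
            {"role": "user", "content": "You are a security agent."},
            {"role": "user", "content": "Scan this repository."},
        ],
    )
except litellm.BadRequestError as exc:
    # Expected: PerplexityException - ... user or tool message(s)
    # should alternate with assistant message(s).
    print(exc)
```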
Expected Behavior:
Strix should interoperate with Perplexity models without requiring manual patching. If my assumption about the cause is correct, tagging the injected identity block as a system message, or letting users disable it, would bring the request into compliance with Perplexity's conversation-ordering rules.
System Information:
- OS: macOS 26.0.1
- Strix Version: v0.3.1
- Python Version: 3.11.9
- LLM Used: perplexity/sonar-reasoning (LiteLLM provider)
Additional Context:
The issue may be related to how the message is constructed in strix/llm/llm.py::_build_identity_message, which currently returns:

```python
{"role": "user", ...}
```
Perplexity’s API expects the conversation history to alternate between user and assistant messages, starting with a user message (the instruction passed to Strix). If this assumption is correct, the identity block violates that alternating contract. A temporary local workaround would be to change the block’s role to system, or to gate it behind an environment flag (e.g. STRIX_DISABLE_AGENT_IDENTITY).
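A sketch of that local workaround, assuming the function takes the identity text and returns a plain message dict (the signature and body here are illustrative, not Strix's actual code):

```python
import os

def _build_identity_message(identity_text: str) -> dict | None:
    """Hypothetical patched strix/llm/llm.py::_build_identity_message."""
    # Proposed opt-out: skip the block entirely when the (hypothetical)
    # STRIX_DISABLE_AGENT_IDENTITY flag is set.
    if os.environ.get("STRIX_DISABLE_AGENT_IDENTITY") == "1":
        return None
    # Use "system" instead of "user" so the block no longer breaks
    # Perplexity's user/assistant alternation rule.
    return {"role": "system", "content": identity_text}
```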
Suggested Solution:
- If my assumption is accurate, one potential fix is to modify strix/llm/llm.py so users can toggle the "identity block" behavior via an environment variable, or to mark the block as a system message by default (as sketched above).
- Alternatively, the message-construction logic could be updated to normalize the history so it respects Perplexity’s expected message structure; see the sketch after this list.
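One way such normalization could look, as a minimal sketch (merge_consecutive_roles is a hypothetical helper, not existing Strix code, and assumes string contents):

```python
def merge_consecutive_roles(messages: list[dict]) -> list[dict]:
    """Merge consecutive same-role messages so the history alternates."""
    merged: list[dict] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            # Fold the duplicate into the previous message instead of
            # sending two same-role messages in a row.
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append(dict(msg))
    return merged
```

Applied just before the LiteLLM call, this would collapse the identity block and the real prompt into a single user message, restoring the alternation Perplexity requires.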
My setup also produces a related error at the LLM inference server, which may share the same root cause:
```
srv log_server_r: request: POST /chat/completions 172.17.0.1 200
got exception: {"code":500,"message":"Cannot have 2 or more assistant messages at the end of the list.","type":"server_error"}
```
This is llama.cpp serving gpt-oss-120b.
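If both errors stem from non-alternating histories, the hypothetical merge_consecutive_roles helper sketched above would also collapse the trailing assistant duplicates that llama.cpp rejects:

```python
# Placeholder history ending with two assistant messages, the shape
# the llama.cpp server rejects with a 500.
history = [
    {"role": "user", "content": "prompt"},
    {"role": "assistant", "content": "partial answer"},
    {"role": "assistant", "content": "continuation"},
]

print(merge_consecutive_roles(history))
# -> the two assistant messages are folded into one, restoring alternation:
# [{'role': 'user', 'content': 'prompt'},
#  {'role': 'assistant', 'content': 'partial answer\n\ncontinuation'}]
```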