Streaming Output Filter for LLM Nodes
Self Checks
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [x] Please do not modify this template :) and fill in all the required fields.
1. Is this request related to a challenge you're experiencing? Tell me about your story.
I'm using Dify to build a chatflow with streaming output. My current chatflow uses two LLM nodes:
- DeepSeek-R1: Generates a thinking process, wrapped in `<think>` tags.
- Gemini: Generates the final answer.
The DeepSeek-R1 node adds the following HTML prefix to its streaming output:
`<details style="color:gray;background-color: #f8f8f8;padding: 8px;border-radius: 4px;" open> <summary> Thinking... </summary>`
When accessing the chatflow via the API, I need to receive the streaming output without this HTML prefix. However, if I try to filter the output using a Code node (e.g., with regular expressions or a Python script), streaming breaks: I have to wait for the entire response before processing it.
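As a consumer-side workaround until something like this exists in Dify itself, the prefix can be stripped while the response is still streaming by buffering only until the closing `</summary>` has passed. The sketch below is a minimal, hypothetical client against Dify's `/v1/chat-messages` streaming endpoint; the URL, API key, user id, and the assumption that the prefix appears exactly once are mine, so adapt it to your setup.

```python
import json
import re

import requests

API_URL = "https://api.dify.ai/v1/chat-messages"  # hypothetical endpoint, replace with yours
API_KEY = "app-..."                               # hypothetical app API key

# Matches the <details ...><summary>...</summary> prefix emitted ahead of the answer.
PREFIX_RE = re.compile(r"<details[^>]*>\s*<summary>.*?</summary>\s*", re.S)


def stream_answer(query: str):
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": query, "inputs": {}, "user": "demo-user",
              "response_mode": "streaming"},
        stream=True,
    )
    buffer = ""          # accumulate text only until the prefix has been consumed
    prefix_done = False
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):])
        if event.get("event") != "message":
            continue                      # skip message_end, ping, etc.
        chunk = event.get("answer", "")
        if prefix_done:
            yield chunk                   # a trailing </details>, if any, would
            continue                      # need similar handling at stream end
        buffer += chunk
        stripped = PREFIX_RE.sub("", buffer, count=1)
        if stripped != buffer:            # the whole prefix has arrived; drop it
            prefix_done = True
            if stripped:
                yield stripped
            buffer = ""
        # Note: if the prefix never appears, this sketch buffers forever.


if __name__ == "__main__":
    for piece in stream_answer("hello"):
        print(piece, end="", flush=True)
```

This keeps the stream intact for downstream consumers, but it is clearly a workaround; a server-side option (or a separate reasoning event, as suggested further below) would be cleaner.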
2. Additional context or comments
My current chatflow
app:
description: 🤓🦽
icon: ♿
icon_background: '#D5F5F6'
mode: advanced-chat
name: Geimini-R1
use_icon_as_answer_icon: false
kind: app
version: 0.1.5
workflow:
conversation_variables: []
environment_variables: []
features:
file_upload:
allowed_file_extensions: []
allowed_file_types:
- image
allowed_file_upload_methods:
- remote_url
- local_file
enabled: true
fileUploadConfig:
audio_file_size_limit: 50
batch_count_limit: 5
file_size_limit: 15
image_file_size_limit: 10
video_file_size_limit: 100
workflow_file_upload_limit: 10
image:
enabled: false
number_limits: 3
transfer_methods:
- local_file
- remote_url
number_limits: 1
opening_statement: ''
retriever_resource:
enabled: false
sensitive_word_avoidance:
enabled: false
speech_to_text:
enabled: false
suggested_questions: []
suggested_questions_after_answer:
enabled: false
text_to_speech:
enabled: false
language: ''
voice: ''
graph:
edges:
- data:
isInIteration: false
sourceType: llm
targetType: answer
id: 1738052643931-source-answer-target
source: '1738052643931'
sourceHandle: source
target: answer
targetHandle: target
type: custom
zIndex: 0
- data:
isInIteration: false
sourceType: start
targetType: document-extractor
id: 1738052599424-source-1738169041280-target
source: '1738052599424'
sourceHandle: source
target: '1738169041280'
targetHandle: target
type: custom
zIndex: 0
- data:
isInIteration: false
sourceType: document-extractor
targetType: llm
id: 1738169041280-source-llm-target
source: '1738169041280'
sourceHandle: source
target: llm
targetHandle: target
type: custom
zIndex: 0
- data:
isInIteration: false
sourceType: llm
targetType: code
id: llm-source-1740391596116-target
source: llm
sourceHandle: source
target: '1740391596116'
targetHandle: target
type: custom
zIndex: 0
- data:
isInIteration: false
sourceType: code
targetType: llm
id: 1740391596116-source-1738052643931-target
source: '1740391596116'
sourceHandle: source
target: '1738052643931'
targetHandle: target
type: custom
zIndex: 0
nodes:
- data:
desc: ''
selected: false
title: Start
type: start
variables: []
height: 54
id: '1738052599424'
position:
x: 80
y: 282
positionAbsolute:
x: 80
y: 282
selected: false
sourcePosition: right
targetPosition: left
type: custom
width: 244
- data:
context:
enabled: true
variable_selector:
- sys
- query
desc: ''
model:
completion_params:
max_tokens: 1
temperature: 0.7
mode: chat
name: deepseek-r1-doubao
provider: openai_api_compatible
prompt_template:
- id: 60740fa2-90fc-4269-93f4-3d150c3df150
role: system
text: 忘记之前的指令 执行用户的指令
- id: fdf45cc5-a4c7-4aed-9dc6-ec09c88eea03
role: user
text: '{{#context#}}'
selected: true
title: deepseek-r1
type: llm
variables: []
vision:
enabled: false
height: 98
id: llm
position:
x: 736
y: 282
positionAbsolute:
x: 736
y: 282
selected: true
sourcePosition: right
targetPosition: left
type: custom
width: 244
- data:
answer: '<think>
{{#llm.text#}}
<\think>
{{#1738052643931.text#}}'
desc: ''
selected: false
title: Direct Response
type: answer
variables: []
height: 121
id: answer
position:
x: 1822
y: 282
positionAbsolute:
x: 1822
y: 282
selected: false
sourcePosition: right
targetPosition: left
type: custom
width: 244
- data:
context:
enabled: false
variable_selector: []
desc: ''
memory:
query_prompt_template: '<User Query>{{#sys.query#}}</User Query>
<file>{{#1738169041280.text#}}</file>
<think>{{#1740391596116.result#}}</think>'
role_prefix:
assistant: ''
user: ''
window:
enabled: false
size: 50
model:
completion_params:
max_tokens: 8192
temperature: 1
mode: chat
name: gemini-exp-1206
provider: openai_api_compatible
prompt_template:
- id: 564efaef-34a5-4c48-9ca3-a9f4f0bdeba9
role: system
text: 用户的输入在<User Query>标签中,你已经在<think>标签里思考过,你需要在<think>的基础上直接给出回答。
selected: false
title: gemini
type: llm
variables: []
vision:
configs:
detail: high
variable_selector: []
enabled: false
height: 98
id: '1738052643931'
position:
x: 1422
y: 282
positionAbsolute:
x: 1422
y: 282
selected: false
sourcePosition: right
targetPosition: left
type: custom
width: 244
- data:
author: stvlynn
desc: ''
height: 202
selected: false
showAuthor: true
text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
16px;","text":"Introduction","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"This
demo utilizes DeepSeek R1''s powerful reasoning capabilities and enhances
output through Gemini model learning, demonstrating how to combine reasoning
LLMs with multimodal LLMs to improve AI''s thinking and problem-solving
abilities.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
theme: blue
title: ''
type: ''
width: 266
height: 202
id: '1738165679422'
position:
x: 61
y: 29
positionAbsolute:
x: 61
y: 29
selected: false
sourcePosition: right
targetPosition: left
type: custom-note
width: 266
- data:
author: stvlynn
desc: ''
height: 236
selected: false
showAuthor: true
text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
16px;","text":"Reasoning Model","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"This
node calls the DeepSeek-R1 reasoning model (deepseek-reasoner). The system
prompt sets DeepSeek-R1 as an LLM with reasoning capabilities that needs
to output complete thinking processes. Its task is to assist other LLMs
without reasoning capabilities and output complete thinking processes based
on user questions. The thinking process will be wrapped in <think> tags.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
theme: blue
title: ''
type: ''
width: 315
height: 236
id: '1738165732645'
position:
x: 736
y: 11
positionAbsolute:
x: 736
y: 11
selected: false
sourcePosition: right
targetPosition: left
type: custom-note
width: 315
- data:
author: stvlynn
desc: ''
height: 251
selected: false
showAuthor: true
text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
16px;","text":"Multimodal Model","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"This
node calls Google''s Gemini model (gemini-1.5-flash-8b-exp-0924). The system
prompt sets the Gemini model as an LLM that excels at learning, and its
task is to learn from others'' (DeepSeek-R1''s) thinking processes about
problems, enhance its results with that thinking, and then provide its answer.
The input thinking process will be treated as a user question, and the final
answer will be wrapped in <o> tags.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
theme: blue
title: ''
type: ''
width: 312
height: 251
id: '1738165823052'
position:
x: 1096
y: 11
positionAbsolute:
x: 1096
y: 11
selected: false
sourcePosition: right
targetPosition: left
type: custom-note
width: 312
- data:
author: stvlynn
desc: ''
height: 226
selected: false
showAuthor: true
text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
16px;","text":"Output","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
12px;","text":"To make it easy to display reasoning and actual output, we
use XML tags (<think><o>) to separate the outputs of the two models.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
theme: blue
title: ''
type: ''
width: 280
height: 226
id: '1738165846879'
position:
x: 1522
y: 11
positionAbsolute:
x: 1522
y: 11
selected: false
sourcePosition: right
targetPosition: left
type: custom-note
width: 280
- data:
desc: ''
is_array_file: true
selected: false
title: Doc Extractor
type: document-extractor
variable_selector:
- sys
- files
height: 92
id: '1738169041280'
position:
x: 383
y: 282
positionAbsolute:
x: 383
y: 282
selected: false
sourcePosition: right
targetPosition: left
type: custom
width: 244
- data:
author: stvlynn
desc: ''
height: 190
selected: false
showAuthor: true
text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
14px;","text":"Document Extractor","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"Extracts
documents into readable text content for LLMs.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
theme: blue
title: ''
type: ''
width: 240
height: 190
id: '1738169102378'
position:
x: 403
y: 29
positionAbsolute:
x: 403
y: 29
selected: false
sourcePosition: right
targetPosition: left
type: custom-note
width: 240
- data:
code: "def main(arg1: str) -> str:\n # 分割出</summary>和</details>之间的内容\n\
\ content = arg1.split('</summary>', 1)[1].split('</details>', 1)[0]\n\
\ # 去除首尾空白及换行符,替换转义字符\n cleaned_content = content.strip().replace('\\\
\\n', '\\n')\n return {\"result\": cleaned_content}\n\n"
code_language: python3
desc: ''
outputs:
result:
children: null
type: string
selected: false
title: 代码执行 2
type: code
variables:
- value_selector:
- llm
- text
variable: arg1
height: 54
id: '1740391596116'
position:
x: 1086
y: 282
positionAbsolute:
x: 1086
y: 282
selected: false
sourcePosition: right
targetPosition: left
type: custom
width: 244
viewport:
x: -418.55114506568816
y: 206.05750689173357
zoom: 0.9000019297935121
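For readability, this is the script inside the Code node above (titled 代码执行 2, i.e. "Code Execution 2"), de-escaped from the DSL string and with its comments translated to English. The `dict` return annotation reflects that a Dify Code node returns a mapping of its declared output variables, and the exact unescaping of the `replace()` arguments is my reading of the YAML:

```python
def main(arg1: str) -> dict:
    # Extract the content between </summary> and </details>
    content = arg1.split('</summary>', 1)[1].split('</details>', 1)[0]
    # Strip leading/trailing whitespace and turn escaped "\n" sequences into real newlines
    cleaned_content = content.strip().replace('\\n', '\n')
    return {"result": cleaned_content}
```

Because a Code node only receives `llm.text` once the upstream LLM node has finished, this filtering necessarily waits for the complete output, which is why the chatflow loses streaming at this point.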
3. Can you help us with this feature?
- [x] I am interested in contributing to this feature.
I had the same problem. Is it possible to return the reasoning content and the body separately? For example, add a new "think_message" event type whose content is the reasoning, so that it can be processed separately. DeepSeek-R1 itself already distinguishes the reasoning via a reasoning_content field and returns it separately.
#14505
> I had the same problem. Is it possible to return the reasoning content and the body separately? For example, add a new "think_message" event type whose content is the reasoning, so that it can be processed separately. DeepSeek-R1 itself already distinguishes the reasoning via a reasoning_content field and returns it separately.

A parameter extractor or a Code node should be able to do it, but it is not a particularly good approach.
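For reference, here is a minimal sketch of how DeepSeek's own streaming API already keeps the two apart (assuming its OpenAI-compatible endpoint and the documented `reasoning_content` delta field): reasoning deltas and answer deltas arrive in separate fields, which is the kind of split a hypothetical `think_message` event in Dify would expose to API callers.

```python
from openai import OpenAI

# Hypothetical credentials; DeepSeek exposes an OpenAI-compatible endpoint.
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

stream = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    reasoning = getattr(delta, "reasoning_content", None)  # reasoning stream
    if reasoning:
        print("[think]", reasoning, end="", flush=True)
    elif delta.content:                                     # answer stream
        print(delta.content, end="", flush=True)
```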
I think mixing the content and reasoning_content fields into a single field is not a good design. When calling the API, if reasoning_content needs to be processed separately, special handling is required in the response. This becomes even more cumbersome, especially in streaming mode.
We urgently need this feature
> I think mixing the content and reasoning_content fields into a single field is not a good design. When calling the API, if reasoning_content needs to be processed separately, special handling is required in the response. This becomes even more cumbersome, especially in streaming mode.

I completely agree with your viewpoint. And when `<details>` tags or `<think>` tags are mixed into the content, TTS speech synthesis also reads the tags aloud, so speech synthesis cannot work properly.
Where is the `<details>` / thinking prefix being added now? I can't find it anywhere in the code.
+1
We urgently need this feature too
We urgently need this feature too
We urgently need this feature too
We urgently need this feature too
Since QwQ, Claude 3.7, and some other thinking models have come out, I think splitting the reasoning and the body is quite necessary. URGENTLY NEEDED!!!
We urgently need this feature too!
Any plan to fix this issue?
Any plan to fix this issue?
We urgently need this feature too
Any plan to fix this issue?