
Streaming Output Filter for LLM Nodes

Open Moeblack opened this issue 10 months ago • 12 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit issues in English; otherwise they will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

I'm using Dify to build a chatflow with streaming output. My current chatflow uses two LLM nodes:

  • DeepSeek-R1: Generates a thinking process, wrapped in <think> tags.
  • Gemini: Generates the final answer.

The DeepSeek-R1 node adds the following HTML prefix to its streaming output:

<details style="color:gray;background-color: #f8f8f8;padding: 8px;border-radius: 4px;" open> <summary> Thinking... </summary>

When accessing the chatflow via the API, I need to receive the streaming output without this HTML prefix. However, if I try to filter the output using a code node (e.g., with regular expressions in a Python script), it breaks streaming: I have to wait for the entire response before processing it.
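
For illustration, the kind of code-node filter this involves looks roughly like the sketch below (a minimal sketch; the regexes here are illustrative, while my actual node in the DSL below uses split()). Because a code node only receives the LLM node's text after that node has fully finished, any filter placed here forces the whole response to be buffered:

import re

def main(arg1: str) -> dict:
    # Drop everything up to and including the </summary> that ends the prefix...
    text = re.sub(r'(?s)^.*?</summary>', '', arg1)
    # ...and drop the closing </details> tag and anything after it.
    text = re.sub(r'(?s)</details>.*$', '', text)
    # By the time this runs, the upstream LLM node has already produced its
    # complete output, so streaming is lost from this point on.
    return {"result": text.strip()}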

2. Additional context or comments

My current chatflow

app:
  description: 🤓🦽
  icon: ♿
  icon_background: '#D5F5F6'
  mode: advanced-chat
  name: Geimini-R1
  use_icon_as_answer_icon: false
kind: app
version: 0.1.5
workflow:
  conversation_variables: []
  environment_variables: []
  features:
    file_upload:
      allowed_file_extensions: []
      allowed_file_types:
      - image
      allowed_file_upload_methods:
      - remote_url
      - local_file
      enabled: true
      fileUploadConfig:
        audio_file_size_limit: 50
        batch_count_limit: 5
        file_size_limit: 15
        image_file_size_limit: 10
        video_file_size_limit: 100
        workflow_file_upload_limit: 10
      image:
        enabled: false
        number_limits: 3
        transfer_methods:
        - local_file
        - remote_url
      number_limits: 1
    opening_statement: ''
    retriever_resource:
      enabled: false
    sensitive_word_avoidance:
      enabled: false
    speech_to_text:
      enabled: false
    suggested_questions: []
    suggested_questions_after_answer:
      enabled: false
    text_to_speech:
      enabled: false
      language: ''
      voice: ''
  graph:
    edges:
    - data:
        isInIteration: false
        sourceType: llm
        targetType: answer
      id: 1738052643931-source-answer-target
      source: '1738052643931'
      sourceHandle: source
      target: answer
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: start
        targetType: document-extractor
      id: 1738052599424-source-1738169041280-target
      source: '1738052599424'
      sourceHandle: source
      target: '1738169041280'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: document-extractor
        targetType: llm
      id: 1738169041280-source-llm-target
      source: '1738169041280'
      sourceHandle: source
      target: llm
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: llm
        targetType: code
      id: llm-source-1740391596116-target
      source: llm
      sourceHandle: source
      target: '1740391596116'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: code
        targetType: llm
      id: 1740391596116-source-1738052643931-target
      source: '1740391596116'
      sourceHandle: source
      target: '1738052643931'
      targetHandle: target
      type: custom
      zIndex: 0
    nodes:
    - data:
        desc: ''
        selected: false
        title: Start
        type: start
        variables: []
      height: 54
      id: '1738052599424'
      position:
        x: 80
        y: 282
      positionAbsolute:
        x: 80
        y: 282
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        context:
          enabled: true
          variable_selector:
          - sys
          - query
        desc: ''
        model:
          completion_params:
            max_tokens: 1
            temperature: 0.7
          mode: chat
          name: deepseek-r1-doubao
          provider: openai_api_compatible
        prompt_template:
        - id: 60740fa2-90fc-4269-93f4-3d150c3df150
          role: system
          text: 忘记之前的指令 执行用户的指令
        - id: fdf45cc5-a4c7-4aed-9dc6-ec09c88eea03
          role: user
          text: '{{#context#}}'
        selected: true
        title: deepseek-r1
        type: llm
        variables: []
        vision:
          enabled: false
      height: 98
      id: llm
      position:
        x: 736
        y: 282
      positionAbsolute:
        x: 736
        y: 282
      selected: true
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        answer: '<think>

          {{#llm.text#}}

          </think>

          {{#1738052643931.text#}}'
        desc: ''
        selected: false
        title: Direct Response
        type: answer
        variables: []
      height: 121
      id: answer
      position:
        x: 1822
        y: 282
      positionAbsolute:
        x: 1822
        y: 282
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        context:
          enabled: false
          variable_selector: []
        desc: ''
        memory:
          query_prompt_template: '<User Query>{{#sys.query#}}</User Query>

            <file>{{#1738169041280.text#}}</file>

            <think>{{#1740391596116.result#}}</think>'
          role_prefix:
            assistant: ''
            user: ''
          window:
            enabled: false
            size: 50
        model:
          completion_params:
            max_tokens: 8192
            temperature: 1
          mode: chat
          name: gemini-exp-1206
          provider: openai_api_compatible
        prompt_template:
        - id: 564efaef-34a5-4c48-9ca3-a9f4f0bdeba9
          role: system
          text: 用户的输入在<User Query>标签中,你已经在<think>标签里思考过,你需要在<think>的基础上直接给出回答。
        selected: false
        title: gemini
        type: llm
        variables: []
        vision:
          configs:
            detail: high
            variable_selector: []
          enabled: false
      height: 98
      id: '1738052643931'
      position:
        x: 1422
        y: 282
      positionAbsolute:
        x: 1422
        y: 282
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        author: stvlynn
        desc: ''
        height: 202
        selected: false
        showAuthor: true
        text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
          16px;","text":"Introduction","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"This
          demo utilizes DeepSeek R1''s powerful reasoning capabilities and enhances
          output through Gemini model learning, demonstrating how to combine reasoning
          LLMs with multimodal LLMs to improve AI''s thinking and problem-solving
          abilities.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
        theme: blue
        title: ''
        type: ''
        width: 266
      height: 202
      id: '1738165679422'
      position:
        x: 61
        y: 29
      positionAbsolute:
        x: 61
        y: 29
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom-note
      width: 266
    - data:
        author: stvlynn
        desc: ''
        height: 236
        selected: false
        showAuthor: true
        text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
          16px;","text":"Reasoning Model","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"This
          node calls the DeepSeek-R1 reasoning model (deepseek-reasoner). The system
          prompt sets DeepSeek-R1 as an LLM with reasoning capabilities that needs
          to output complete thinking processes. Its task is to assist other LLMs
          without reasoning capabilities and output complete thinking processes based
          on user questions. The thinking process will be wrapped in <think> tags.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
        theme: blue
        title: ''
        type: ''
        width: 315
      height: 236
      id: '1738165732645'
      position:
        x: 736
        y: 11
      positionAbsolute:
        x: 736
        y: 11
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom-note
      width: 315
    - data:
        author: stvlynn
        desc: ''
        height: 251
        selected: false
        showAuthor: true
        text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
          16px;","text":"Multimodal Model","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"This
          node calls Google''s Gemini model (gemini-1.5-flash-8b-exp-0924). The system
          prompt sets the Gemini model as an LLM that excels at learning, and its
          task is to learn from others'' (DeepSeek-R1''s) thinking processes about
          problems, enhance its results with that thinking, and then provide its answer.
          The input thinking process will be treated as a user question, and the final
          answer will be wrapped in <o> tags.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
        theme: blue
        title: ''
        type: ''
        width: 312
      height: 251
      id: '1738165823052'
      position:
        x: 1096
        y: 11
      positionAbsolute:
        x: 1096
        y: 11
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom-note
      width: 312
    - data:
        author: stvlynn
        desc: ''
        height: 226
        selected: false
        showAuthor: true
        text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
          16px;","text":"Output","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
          12px;","text":"To make it easy to display reasoning and actual output, we
          use XML tags (<think><o>) to separate the outputs of the two models.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
        theme: blue
        title: ''
        type: ''
        width: 280
      height: 226
      id: '1738165846879'
      position:
        x: 1522
        y: 11
      positionAbsolute:
        x: 1522
        y: 11
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom-note
      width: 280
    - data:
        desc: ''
        is_array_file: true
        selected: false
        title: Doc Extractor
        type: document-extractor
        variable_selector:
        - sys
        - files
      height: 92
      id: '1738169041280'
      position:
        x: 383
        y: 282
      positionAbsolute:
        x: 383
        y: 282
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        author: stvlynn
        desc: ''
        height: 190
        selected: false
        showAuthor: true
        text: '{"root":{"children":[{"children":[{"detail":0,"format":0,"mode":"normal","style":"font-size:
          14px;","text":"Document Extractor","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0},{"children":[{"detail":0,"format":0,"mode":"normal","style":"","text":"Extracts
          documents into readable text content  for LLMs.","type":"text","version":1}],"direction":"ltr","format":"","indent":0,"type":"paragraph","version":1,"textFormat":0}],"direction":"ltr","format":"","indent":0,"type":"root","version":1}}'
        theme: blue
        title: ''
        type: ''
        width: 240
      height: 190
      id: '1738169102378'
      position:
        x: 403
        y: 29
      positionAbsolute:
        x: 403
        y: 29
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom-note
      width: 240
    - data:
        code: "def main(arg1: str) -> str:\n    # 分割出</summary>和</details>之间的内容\n\
          \    content = arg1.split('</summary>', 1)[1].split('</details>', 1)[0]\n\
          \    # 去除首尾空白及换行符,替换转义字符\n    cleaned_content = content.strip().replace('\\\
          \\n', '\\n')\n    return {\"result\": cleaned_content}\n\n"
        code_language: python3
        desc: ''
        outputs:
          result:
            children: null
            type: string
        selected: false
        title: 代码执行 2
        type: code
        variables:
        - value_selector:
          - llm
          - text
          variable: arg1
      height: 54
      id: '1740391596116'
      position:
        x: 1086
        y: 282
      positionAbsolute:
        x: 1086
        y: 282
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    viewport:
      x: -418.55114506568816
      y: 206.05750689173357
      zoom: 0.9000019297935121
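
For readability, the escaped script embedded in the 代码执行 2 ("Code Execution 2") node above decodes to roughly the following Python (comments translated from Chinese; the return annotation says str, but the dict shape is what Dify's code node expects):

def main(arg1: str) -> str:
    # Split out the content between </summary> and </details>
    content = arg1.split('</summary>', 1)[1].split('</details>', 1)[0]
    # Trim leading/trailing whitespace; replace escaped \n sequences with real newlines
    cleaned_content = content.strip().replace('\\n', '\n')
    return {"result": cleaned_content}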

3. Can you help us with this feature?

  • [x] I am interested in contributing to this feature.

Moeblack avatar Feb 24 '25 12:02 Moeblack

I had the same problem. Is it possible to return the reasoning content and the body separately? For example, add a new "think_message" event type, and the return content is reasoning content, so that it can be processed separately. DeepSeek-R1 itself is also distinguished by the reasoning_content field and returned separately.

#14505
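
When calling deepseek-reasoner through DeepSeek's OpenAI-compatible API directly, the two parts already arrive in separate fields; a minimal consumer sketch (the handle_* callbacks are hypothetical, and the proposed "think_message" event does not exist in Dify today):

from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")
stream = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    # Reasoning tokens arrive in a dedicated field, separate from the answer.
    if getattr(delta, "reasoning_content", None):
        handle_think_chunk(delta.reasoning_content)   # would map to a "think_message" event
    elif delta.content:
        handle_answer_chunk(delta.content)            # maps to the existing "message" event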

xiaoqiang1999 avatar Feb 28 '25 01:02 xiaoqiang1999

I had the same problem. Is it possible to return the reasoning content and the body separately? For example, add a new "think_message" event type, and the return content is reasoning content, so that it can be processed separately. DeepSeek-R1 itself is also distinguished by the reasoning_content field and returned separately.

#14505

A parameter extractor or a code node should be able to do this, but it's not a particularly good approach.

rowstop avatar Feb 28 '25 02:02 rowstop

I think mixing the content and reasoning_content fields into a single field is not a good design. When calling the API, if reasoning_content needs to be processed separately, special handling is required in the response, which becomes especially cumbersome in streaming mode.
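
To make the streaming case concrete: a client consuming the mixed stream has to run a small state machine across chunk boundaries just to separate reasoning from the answer (a minimal sketch assuming the reasoning is delimited by <think>...</think> tags, as discussed in this thread):

def split_stream(chunks):
    """Yield ('think', text) and ('answer', text) pieces from a mixed stream."""
    buffer, in_think = "", False
    for chunk in chunks:
        buffer += chunk
        while True:
            tag = "</think>" if in_think else "<think>"
            pos = buffer.find(tag)
            if pos == -1:
                # Hold back a possible partial tag split across chunk boundaries.
                safe = len(buffer) - (len(tag) - 1)
                if safe > 0:
                    yield ("think" if in_think else "answer", buffer[:safe])
                    buffer = buffer[safe:]
                break
            if pos:
                yield ("think" if in_think else "answer", buffer[:pos])
            buffer = buffer[pos + len(tag):]
            in_think = not in_think
    if buffer:
        yield ("think" if in_think else "answer", buffer)

For example, list(split_stream(['<thi', 'nk>reasoning</think>ans', 'wer'])) yields ('think', 'reasoning') and then ('answer', 'answer'); every API consumer ends up reimplementing this.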

dididada002 avatar Feb 28 '25 10:02 dididada002

We urgently need this feature

cccsaber avatar Mar 01 '25 15:03 cccsaber

I think mixing the content and reasoning_content fields into a single field is not a good design. When calling the API, if reasoning_content needs to be processed separately, special handling is required in the response, which becomes especially cumbersome in streaming mode.

I completely agree with your viewpoint. Also, when <details> or <think> tags are mixed into the content, TTS speech synthesis reads the tags aloud, so speech synthesis does not work properly.

xiaoqiang1999 avatar Mar 03 '25 01:03 xiaoqiang1999

Where is the details and thinking prefix added now? I can't find it in the code anymore.

meitianjinbu avatar Mar 03 '25 09:03 meitianjinbu

+1

chowyu12 avatar Mar 04 '25 10:03 chowyu12

We urgently need this feature too

chowyu12 avatar Mar 04 '25 10:03 chowyu12

We urgently need this feature too

samni728 avatar Mar 08 '25 15:03 samni728

We urgently need this feature too

Chendz321 avatar Mar 13 '25 03:03 Chendz321

We urgently need this feature too

yinfeng2018i avatar Mar 18 '25 03:03 yinfeng2018i

Since QwQ, Claude 3.7, and some other thinking models have come out, I think splitting the reasoning and the body is quite necessary. URGENTLY NEEDED!!!

brownieexiaoo avatar Mar 18 '25 03:03 brownieexiaoo

We urgently need this feature too!

yanyuw avatar Apr 17 '25 11:04 yanyuw

Any plan to fix this issue?

NaiveYan avatar May 15 '25 05:05 NaiveYan

Any plan to fix this issue?

cccsaber avatar May 23 '25 07:05 cccsaber

We urgently need this feature too

luxaug avatar Sep 11 '25 06:09 luxaug

Any plan to fix this issue?

cccsaber avatar Sep 22 '25 06:09 cccsaber