[Question]: Can't select "transformer" node by variable expression

Open leoliu2031 opened this issue 1 month ago • 5 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

Describe your problem

Image

leoliu2031 avatar Nov 26 '25 11:11 leoliu2031

Not supported yet.

KevinHuSh avatar Nov 26 '25 11:11 KevinHuSh

Whether a "transformer" node can be selected by variable expression depends on how the UI builds its list of selectable nodes. The dropdown for variable references is populated by walking the graph of preceding nodes, so any transformer output (which publishes a variable, usually named chunks) should be selectable by downstream nodes via variable expressions. You can insert these variables in prompts by typing / or using the UI button, and the output variable name is configurable in the transformer settings (see the docs).
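
As a rough illustration of the "walk the graph of preceding nodes" idea, here is a minimal Python sketch. It is not RAGFlow's code: the edge list format, the node names, and the `node@variable` reference syntax are all assumptions made up for the example.

```python
from collections import deque


def upstream_variables(edges, outputs, target):
    """Collect output variables of every node upstream of `target`.

    edges:   list of (source, destination) node-id pairs (hypothetical format)
    outputs: dict mapping node id -> list of output variable names
    """
    # Build a reverse adjacency map so we can walk from the target back to its sources.
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, []).append(src)

    seen, queue, found = set(), deque([target]), []
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent in seen:
                continue
            seen.add(parent)
            queue.append(parent)
            # The "node@variable" reference syntax is illustrative only.
            found.extend(f"{parent}@{name}" for name in outputs.get(parent, []))
    return found


edges = [("Parser", "Splitter"), ("Splitter", "ExtractorA"), ("ExtractorA", "ExtractorB")]
outputs = {"Parser": ["text"], "Splitter": ["chunks"], "ExtractorA": ["chunks"]}
selectable = upstream_variables(edges, outputs, "ExtractorB")
```

Under this model, the output of ExtractorA would be offered to ExtractorB alongside the Splitter output. Whether the actual dropdown behaves this way is exactly what this issue is asking about.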

However, a recent update restricts which nodes can be selected as the next step after a transformer: only an Indexer (formerly Tokenizer) or another Transformer can now follow a Transformer node. This restriction is enforced in the node selection dropdown and may affect how variable expressions resolve or appear in the UI (see PR #10515).
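
The restriction described here could be sketched as a simple filter over candidate node types. This is only an illustration in Python: the rule table and function name are hypothetical, and only the node type names come from the comment itself.

```python
# Hypothetical rule table; only the node type names come from the comment above.
ALLOWED_AFTER = {"Transformer": {"Indexer", "Transformer"}}


def allowed_next_nodes(current_type, candidate_types):
    """Return the candidate node types that may follow `current_type`."""
    allowed = ALLOWED_AFTER.get(current_type)
    if allowed is None:
        # No restriction recorded for this node type: everything is allowed.
        return list(candidate_types)
    return [t for t in candidate_types if t in allowed]


options = allowed_next_nodes(
    "Transformer", ["Parser", "Splitter", "Indexer", "Transformer"]
)
```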

If you can't select a transformer node by variable expression, check if your pipeline structure matches these new restrictions. If your use case requires a different structure, you may need to adjust the pipeline so that the transformer node is followed only by allowed node types.

dosubot[bot] avatar Nov 26 '25 11:11 dosubot[bot]

> Not supported yet.

Thanks for the response! I'd like to elaborate on why this limitation has significant impact on real-world use cases:

1. Data Loss in Chained Extractors

Currently, when multiple Extractors are chained (A → B → C → D), each Extractor can only reference the original Splitter output. This means:

| Stage | What Happens | Result |
|---|---|---|
| Extractor A | Adds summary field to chunks | ✅ Works |
| Extractor B | Gets fresh chunks from Splitter, adds keywords | summary is lost |
| Extractor C | Gets fresh chunks from Splitter, adds questions | summary + keywords lost |
| Extractor D | Gets fresh chunks from Splitter, adds metadata | ❌ All previous fields lost |

Final output only contains metadata, not the accumulated fields users expect.
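
To make the difference concrete, here is a small Python simulation of the two wirings. It is a sketch only: `run_extractor` is a made-up stand-in for an LLM-backed Extractor, and the chunk dictionaries are invented sample data, not RAGFlow's real chunk format.

```python
import copy


def run_extractor(chunks, field):
    """Return copies of `chunks` with one extra field (stand-in for an LLM call)."""
    result = copy.deepcopy(chunks)
    for chunk in result:
        chunk[field] = f"<{field} of {chunk['text']}>"
    return result


splitter_output = [{"text": "chunk-1"}, {"text": "chunk-2"}]
fields = ["summary", "keywords", "questions", "metadata"]

# Behavior described above: every Extractor reads the Splitter output,
# so each stage discards what the previous stage added.
current = splitter_output
for field in fields:
    current = run_extractor(splitter_output, field)

# Behavior users expect: each Extractor reads its predecessor's output,
# so fields accumulate along the chain.
expected = splitter_output
for field in fields:
    expected = run_extractor(expected, field)

print(sorted(current[0]))   # ['metadata', 'text']
print(sorted(expected[0]))  # ['keywords', 'metadata', 'questions', 'summary', 'text']
```

In both cases four Extractor runs are paid for, but with the first wiring only the last field survives.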

2. Wastes LLM API Calls

Users who want all 4 fields (summary, keywords, questions, metadata) on each chunk have no workaround through the UI. They either:

  • Accept incomplete data (only the last Extractor's field)
  • Manually edit the DSL JSON file (not user-friendly)
  • Use a single Extractor with a complex prompt (loses modularity)

3. Breaks the Visual Pipeline Mental Model

The Canvas UI shows A → B → C → D as a linear flow, implying data passes through each stage. But in reality, B/C/D fetch data from an earlier stage (Splitter), not from their immediate predecessor. This creates a confusing disconnect between visual representation and actual behavior.

leoliu2031 avatar Nov 27 '25 02:11 leoliu2031

It seems this has evolved into a feature request. I would recommend submitting a separate feature request and describing the reasoning behind its necessity.

Magicbook1108 avatar Nov 27 '25 02:11 Magicbook1108

> It seems this has evolved into a feature request. I would recommend submitting a separate feature request and describing the reasoning behind its necessity.

done. https://github.com/infiniflow/ragflow/issues/11557

leoliu2031 avatar Nov 27 '25 02:11 leoliu2031