
Return citations from pipelines

Open nitinkr0411 opened this issue 1 year ago • 11 comments

How do we return citations from pipelines?

Currently, I'm implementing a RAG pipeline and showing citations via Markdown formatting, which seems hacky. Is there a way we can return citations like Open WebUI's built-in RAG?

nitinkr0411 avatar Aug 17 '24 07:08 nitinkr0411

@tjbck Is this already available? Or is this a new feature?

nitinkr0411 avatar Aug 22 '24 02:08 nitinkr0411

I'm also interested; there's also discussion #156 with >10 upvotes 🤔

ndrsfel avatar Aug 31 '24 13:08 ndrsfel

Any news about this?

nkostop avatar Nov 22 '24 16:11 nkostop

Any update on this issue? Is it possible to show citations that don't come from the internal RAG?

gesaleh avatar Jan 13 '25 16:01 gesaleh

Any update on possible support for citations from an external RAG or LLM response, using the same tags as used by OWUI?

<source_context></source_context><source></source><source_id></source_id>

gesaleh avatar Jan 21 '25 11:01 gesaleh

Any update on this?

jayanthkmr avatar Apr 13 '25 19:04 jayanthkmr

You can now send citations via events!

tjbck avatar Apr 13 '25 19:04 tjbck

You can now send citations via events!

Is there an example, please?

gesaleh avatar Apr 13 '25 19:04 gesaleh

I think this is the example: https://github.com/open-webui/pipelines/blob/main/examples/pipelines/events_pipeline.py

Bschim avatar Apr 30 '25 12:04 Bschim

For people still struggling: I got it working like this.

    # requires: from typing import Generator, Iterator, List, Union
    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:

        yield "The answer."

        yield {
            "event": {
                "type": "citation",
                "data": {
                    "document": ["Content of document"],
                    "metadata": [{"source": "test.txt"}],
                    "source": {"name": "test.txt", "url": "https://example.com/test.txt"},
                },
            }
        }

Or with a streaming response, e.g. from LlamaIndex:

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:

        query_engine = self.index.as_query_engine(streaming=True)
        response = query_engine.query(user_message)

        # Emit one citation event per retrieved source node.
        for node in response.source_nodes:
            yield {
                "event": {
                    "type": "citation",
                    "data": {
                        "document": [node.text],
                        "metadata": [{"source": node.metadata["file_name"]}],
                        "source": {"name": node.metadata["file_name"], "url": node.metadata["file_path"]},
                    },
                }
            }

        # Then stream the generated answer text.
        for text in response.response_gen:
            yield text

Details about the possible events are in the documentation about event emitters.

djmaze avatar May 13 '25 23:05 djmaze

Turns out normal text also has to be output as events. Otherwise the event handling process seems to break.

So outputting text needs to be done like this:

        for text in response.response_gen:
            yield {"event": {"type": "message", "data": {"content": text}}}

djmaze avatar May 14 '25 10:05 djmaze

Is there a list of supported event types?

tobiasge avatar Jul 17 '25 14:07 tobiasge

Is there a list of supported event types?

https://openwebui.com/features/plugin/events/#-full-list-of-event-types

pelag0s avatar Jul 17 '25 14:07 pelag0s

The link is actually https://docs.openwebui.com/features/plugin/events/#-full-list-of-event-types

pkeffect avatar Jul 17 '25 22:07 pkeffect

For people still struggling: I got it working like this. […]

If one wants to show the relevance score as well (e.g. in a custom RAG pipeline), just add a distances key of type list[float] to the data dictionary (Source). It will then render like the built-in RAG pipeline that uses the Open WebUI Knowledge Base:

[Screenshot: citations rendered with relevance scores, matching the built-in RAG pipeline]
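
A minimal sketch of such a citation event (same shape as the examples above; the distances values here are assumed to be one relevance score per document):

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:
        yield "The answer."

        yield {
            "event": {
                "type": "citation",
                "data": {
                    "document": ["Content of document"],
                    "metadata": [{"source": "test.txt"}],
                    "source": {"name": "test.txt", "url": "https://example.com/test.txt"},
                    # assumed: one relevance/similarity score per entry in "document"
                    "distances": [0.87],
                },
            }
        }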

fynrup avatar Aug 09 '25 11:08 fynrup

For people still struggling: I got it working like this. […]

Why are follow-ups and tags lost when using events? Is there a related event type?

jianwenlei avatar Sep 09 '25 08:09 jianwenlei

Open WebUI uses the pipe function for generating the titles, follow-ups, tags, etc. I haven't found a way to change this. As a workaround, what I'm doing is rerouting these requests to a different model:

# Assumed imports; exact module paths may vary with the Open WebUI version:
from typing import Any, Callable

from fastapi import Request
from pydantic import BaseModel, Field

from open_webui.models.users import Users
from open_webui.utils.chat import generate_chat_completion


class Pipe:
    class Valves(BaseModel):
        TASK_MODEL: str = Field(default="", description="What model to use to perform tasks (title, follow-up and tag generation)")

    async def pipe(
        self,
        body: dict,
        __user__: dict,
        __request__: Request,
        __event_emitter__: Callable[[dict], Any] = None,
        __event_call__: Callable[[dict], Any] = None,
        __task__: str | None = None,
    ):
        # Open WebUI also calls this pipe function for tasks (generating
        # tags, follow-ups and titles), so we reroute those to another model.
        if __task__ is not None:
            user = Users.get_user_by_id(__user__["id"])
            body["model"] = self.valves.TASK_MODEL

            res = await generate_chat_completion(__request__, body, user)
            res = res["choices"][0]["message"]["content"]
            yield res
            return

        # your pipe logic ...

GeorgSchenzel avatar Sep 09 '25 15:09 GeorgSchenzel

Open WebUI uses the pipe function for generating the titles, follow-ups, tags, etc. […]

IIRC you can specify which background tasks go to which model under Admin Panel -> Interface. Another alternative is to detect the tasks in your pipe and, if you see one, simply return prematurely, as in the sketch below.
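
A minimal sketch of that early-return variant (same __task__ parameter as in the snippet above; returning before the first yield just produces an empty result for task requests):

    class Pipe:
        async def pipe(self, body: dict, __task__: str | None = None):
            # Skip background tasks (title, tag, follow-up generation) entirely.
            if __task__ is not None:
                return

            yield "The answer."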

Maleya avatar Sep 09 '25 15:09 Maleya