
feat: uploading files without backend processing

Open wangjiyang opened this issue 7 months ago • 19 comments

Check Existing Issues

  • [x] I have searched the existing issues and discussions.

Problem Description

I am trying to use Open WebUI as a frontend and to process user requests via the chat UI. Open WebUI has a great user experience and works stably. Thanks for your great work.

We are trying to add more features to make good use of our self-hosted AIGC backend. The feature is something like: the user uploads some files and asks a question, and then our backend (which is driven by a pipeline) gets the original file, processes it, and sends SSE events to the pipeline, which emits them to the Open WebUI frontend. The RAG feature, or the OCR feature (which I don't know in detail), always tries to extract the data into text, which is not a must for our use case. This takes a lot of time and hurts the user experience. Also, it seems Open WebUI always tries to send the raw text to the Open WebUI backend, which is also not required by our feature and hurts the user experience.

So I am requesting one feature that:

  1. bypass the content extraction feature completely (I already checked "bypass vector database" inside the Documents tab, but it seems it is not disabled completely)
  2. let the pipeline access the user-uploaded file via a raw URL (respecting the CORS settings, of course), so it can do more with it

Could we please discuss this feature? Thanks.

Desired Solution you'd like

  1. bypass the content extraction feature completely (I already checked "bypass vector database" inside the Documents tab, but it seems it is not disabled completely)
  2. let the pipeline access the user-uploaded file via a raw URL (respecting the CORS settings, of course), so it can do more with it (a sketch follows below)
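
For illustration, a minimal sketch of request 2 as it could work today, assuming the backend's file content endpoint (/api/v1/files/{id}/content) stays reachable and an API key is available — the URL prefix, instance URL, and auth header are all assumptions about the deployment:

import requests

OPENWEBUI_URL = "http://localhost:3000"  # assumed instance URL
FILE_ID = "uuid-returned-by-the-upload-endpoint"

# Fetch the original bytes, with no extraction or RAG processing applied.
resp = requests.get(
    f"{OPENWEBUI_URL}/api/v1/files/{FILE_ID}/content",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
resp.raise_for_status()
raw_bytes = resp.content  # hand these to the pipeline as-is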

Alternatives Considered

No response

Additional Context

No response

wangjiyang avatar Mar 31 '25 09:03 wangjiyang

Yes, this is needed. The Bypass Embedding and Retrieval option does not work for this purpose; I think the chunking and embedding happen when the file is uploaded.

harrywang avatar Apr 15 '25 00:04 harrywang

It would be great if that file could be passed to the LLM using a files API, similar to OpenAI's. That way you could pass text files, images, and PDFs directly to the LLM.
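
For reference, OpenAI's documented PDF-input flow looks roughly like this with the official Python SDK — the model name is illustrative, and the exact content shape may vary by API version, so treat this as a hedged sketch:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload once, then reference the file by id in a chat message.
uploaded = client.files.create(file=open("report.pdf", "rb"), purpose="user_data")

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "file", "file": {"file_id": uploaded.id}},
            {"type": "text", "text": "Summarize this document."},
        ],
    }],
)
print(response.choices[0].message.content)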

roeizavida avatar Apr 15 '25 10:04 roeizavida

@harrywang Chunking and embedding do NOT happen when the file is uploaded if you have Bypass Embedding and Retrieval enabled.

tjbck avatar Apr 19 '25 09:04 tjbck

A great use case for this would also be uploading audio files directly to the AI model.

For example: upload an audio file directly to a Gemini model, so that it can differentiate the people speaking in the audio and create a comprehensive transcription, complete with speaker identification.

If this were possible, that'd be cool.
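
Outside Open WebUI, the Gemini API already supports this flow. A minimal sketch with the google-generativeai SDK — API key and model name are placeholders:

import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")

# Upload the audio via the File API, then pass it straight to the model.
audio = genai.upload_file("meeting.mp3")

model = genai.GenerativeModel("gemini-1.5-pro")  # illustrative model name
response = model.generate_content([
    audio,
    "Transcribe this recording and label each distinct speaker.",
])
print(response.text)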

Classic298 avatar Apr 19 '25 12:04 Classic298

We've been trying to implement models in Open WebUI that use tools to operate on files (instead of passing files directly to the AI model). These tools could be a code execution sandbox, a custom backend for file processing, or a remote API where we want the AI to upload the files to (like creating a Jira issue with attachment). If I understand this issue correctly, this is more or less about the same feature.

Ideal for us would be an option that:

  • Makes the name or ids of uploaded files available in the model prompt so that the model can operate on the "name" level.
  • Makes the content available to tools (already works for built-in Tools because they have access to storage, but not for Tool Servers).
  • Does not pass the contents of the file to the model in order to avoid confusing the model.
  • And in particular, it does not treat images differently. At the moment, images are distinct from files and don't have a filename, for instance.
  • Configurable per model because we want to host "normal" models that do pass files to the API on the same Open WebUI instance.

We've already tried a few workarounds with Tools and Functions, but these are not powerful enough.

tremlin avatar Apr 19 '25 15:04 tremlin

@harrywang Chunking and embedding do NOT happen when the file is uploaded if you have Bypass Embedding and Retrieval enabled.

You are right. The following is an AI analysis of the code base describing what happens once a file is uploaded in chat. I hope we can disable file upload per file type for each model.

πŸ“ File Upload Process Flow (Open WebUI)

1. Frontend Initiates Upload

  • When a user uploads a file in the chat interface, the frontend component (typically MessageInput.svelte) calls the uploadFileHandler function.
  • This function uses the uploadFile API from $lib/apis/files.
  • A FormData object is created, the file is appended, and a POST request is sent to /files/ with the user's authentication token.
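
A minimal sketch of that request from outside the browser — the /api/v1 prefix, instance URL, and API key are assumptions about a typical deployment:

import requests

# Mirror the frontend's multipart upload to the files router.
with open("report.pdf", "rb") as fh:
    resp = requests.post(
        "http://localhost:3000/api/v1/files/",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": ("report.pdf", fh, "application/pdf")},
    )
resp.raise_for_status()
print(resp.json())  # includes the generated file ID and metadata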

2. Backend Receives the File

  • The request is handled by the upload_file endpoint in /backend/open_webui/routers/files.py.
  • The file is received as an UploadFile object via FastAPI.

3. File Processing and Storage

  • A unique ID (UUID) is generated for the file.
  • The original filename is preserved but prefixed with the UUID.
  • The file is stored using one of several storage providers:
    • LocalStorageProvider – local filesystem
    • S3StorageProvider – Amazon S3
    • GCSStorageProvider – Google Cloud Storage
    • AzureStorageProvider – Azure Blob Storage

4. Database Entry Creation

  • A database entry is created with metadata:
    • File ID, name, path
    • Content type
    • File size
    • Uploading user ID

5. Content Extraction and Processing

  • If process=true, the system attempts to extract content:
    • Audio files (.mp3, .wav, .ogg, .m4a): Transcribed to text
    • Other files (excluding images): Processed via a content extraction engine
  • Extracted text is stored in the file's data field.
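
Based on the process flag described above, skipping extraction at upload time might look like the sketch below. Whether process is exposed as a query parameter, and under what name, depends on the Open WebUI version, so treat this as an assumption:

import requests

# Upload without content extraction: audio would not be transcribed and
# documents would not be parsed to text if the flag is honored.
with open("notes.pdf", "rb") as fh:
    resp = requests.post(
        "http://localhost:3000/api/v1/files/",
        params={"process": "false"},
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": fh},
    )
print(resp.json())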

6. Vector Database Integration

  • Unless BYPASS_EMBEDDING_AND_RETRIEVAL is enabled:
    • Extracted content is converted into embeddings.
    • Embeddings are stored in a vector database.
    • Each file gets its own collection named: file-{file_id}.

7. Response to Frontend

  • The backend returns:
    • File ID and metadata
    • Extracted content (if applicable)
    • Any processing errors

8. Frontend Updates UI

  • The frontend receives the response and:
    • Updates the UI to display the uploaded file.
    • Enables the file to be referenced in chat or used for RAG (retrieval-augmented generation).

πŸ” Special File Type Handling

File Type Behavior
Audio files Automatically transcribed to text
Images Stored, not processed for text extraction
Documents/PDFs Text extracted via engine (Tika, Docling, etc.)

This process enables rich file handling and integration into AI interactions, especially for RAG-based workflows.

harrywang avatar Apr 19 '25 16:04 harrywang

We've been trying to implement models in Open WebUI that use tools to operate on files (instead of passing files directly to the AI model). These tools could be a code execution sandbox, a custom backend for file processing, or a remote API where we want the AI to upload the files to (like creating a Jira issue with attachment). If I understand this issue correctly, this is more or less about the same feature.

Ideal for us would be an option that:

  • Makes the name or ids of uploaded files available in the model prompt so that the model can operate on the "name" level.
  • Makes the content available to tools (already works for built-in Tools because they have access to storage, but not for Tool Servers).
  • Does not pass the contents of the file to the model in order to avoid confusing the model.
  • And in particular, it does not treat images differently. At the moment, images are distinct from files and don't have a filename, for instance.
  • Configurable per model because we want to host "normal" models that do pass files to the API on the same Open WebUI instance.

We've already tried a few workarounds with Tools and Functions, but these are not powerful enough.

Yes, see my comments above. We are hardcoding some conditions to treat images and other files (txt, csv) differently, but it would be nice if this could be configured via the Admin UI.

harrywang avatar Apr 19 '25 16:04 harrywang

Yes, see my comments above. We are hardcoding some conditions to treat images and other files (txt, csv) differently, but it would be nice if this could be configured via the Admin UI.

I'd suggest making this a provider/model switch instead of a global switch. With less capable models, the existing processing is great, but with Gemini I want to pass everything as-is.

mjp0 avatar Apr 20 '25 05:04 mjp0

Yes, see my comments above. We are hardcoding some conditions to treat images and other files (txt, csv) differently, but it would be nice if this could be configured via the Admin UI.

I'd suggest making this a provider/model switch instead of a global switch. With less capable models, the existing processing is great, but with Gemini I want to pass everything as-is.

Yes, introducing a WebUI configuration for different file types is a good idea, and making this switch configurable per model would be even better.

wangjiyang avatar Apr 24 '25 03:04 wangjiyang

The option of a complete bypass of any processing on a per-model (or per-chat) basis would be great. For example, I would like to upload PDFs as-is to Gemini 2.5 via the API, due to its excellent PDF processing capability.

matjbru avatar May 07 '25 09:05 matjbru

For those looking to process files using a custom pipeline, you can fetch the file in the pipe method from the Open WebUI backend using the code below. The awkward part is that you have to intercept the file and add it to the message history before passing it to your model. There is no record of the file in the conversation history maintained by Open WebUI, which makes it impossible to build an agentic system that uses the internal conversation history and can reason on top of files that were uploaded earlier in the conversation. It works very well for files passed in the latest invocation of the pipe method, i.e. the last user message.

TLDR: Please record the presence of files in the conversation history, not just in the body, because the body only contains information for the latest file that was uploaded, and not for files that were uploaded earlier in that conversation.


import base64

import requests


async def inlet(self, body: dict, user: dict) -> dict:
    # Preserve the uploaded-file references so the pipe method can use them.
    body["file_contents"] = body.get("files", [])
    return body


def pipe(self, body: dict, **kwargs):  # signature abridged in the original
    messages = body.get("messages", [])  # inferred setup; elided in the original

    if body["file_contents"]:
        last_message_content = messages[-1]["content"]
        if isinstance(last_message_content, str):
            last_message_content = [
                {"type": "text", "text": last_message_content}
            ]

        for f in body["file_contents"]:
            # Fetch the raw file bytes back from the Open WebUI backend.
            url = self.config.OPENWEBUI_URL + f["url"] + "/content"
            print(f"fetching files from {url}")
            response = requests.get(
                url,
                headers={
                    "Authorization": f"Bearer {self.config.OPENWEBUI_API_KEY}"
                },
            )

            if response.status_code == 200 and isinstance(response.content, bytes):
                # Attach the file to the last user message as a base64 data URL.
                last_message_content.append(
                    {
                        "type": "file",
                        "file": {
                            "filename": f["file"]["filename"],
                            "file_data": f"data:{f['file']['meta']['content_type']};base64,{base64.b64encode(response.content).decode('utf-8')}",
                        },
                    }
                )
            else:
                print("Response from fetching files did not return bytes")

        messages[-1]["content"] = last_message_content
istranic avatar May 29 '25 20:05 istranic

What is the status of this issue? Is it scheduled to be implemented? Is help needed?

MonsieurBibo avatar Jul 03 '25 13:07 MonsieurBibo

Even with Bypass Embedding and Retrieval turned on, it still runs: the uploaded PDF is parsed to text and injected into the system prompt.

@istranic could you kindly publish this code as a pipe on the marketplace? Could it also support storing the file in a local folder when it is uploaded to Open WebUI?

Below is what ChatGPT recommended:

import base64
import json
import os
import time
from typing import Any, Awaitable, Callable, Dict, Optional

import requests
from pydantic import BaseModel, Field


# Define a simple config-like class to store environment variables
class Config:
    OPENWEBUI_URL: str = os.getenv("OPENWEBUI_URL", "http://localhost:3000")
    OPENWEBUI_API_KEY: str = os.getenv("OPENWEBUI_API_KEY", "")
    SAVE_FOLDER: str = os.getenv("UPLOAD_SAVE_DIR", "data/uploads")


class Pipe:
    class Valves(BaseModel):
        ENABLE_STATUS_INDICATOR: bool = Field(default=True)
        EMIT_INTERVAL: float = Field(default=2.0)

    def __init__(self):
        self.name = "OpenWebUI Save Local Agent"
        self.valves = self.Valves()
        self.last_emit_time = 0
        self.config = Config()

    async def emit_status(
        self,
        __event_emitter__: Callable[[dict], Awaitable[None]],
        level: str,
        message: str,
        done: bool,
    ):
        # Throttle status events so the UI is not flooded.
        current_time = time.time()
        if (
            __event_emitter__
            and self.valves.ENABLE_STATUS_INDICATOR
            and (
                current_time - self.last_emit_time >= self.valves.EMIT_INTERVAL or done
            )
        ):
            await __event_emitter__({
                "type": "status",
                "data": {
                    "status": "complete" if done else "in_progress",
                    "level": level,
                    "description": message,
                    "done": done,
                },
            })
            self.last_emit_time = current_time

    async def pipe(
        self,
        body: dict,
        __user__: Optional[dict] = None,
        __event_emitter__: Optional[Callable[[dict], Awaitable[None]]] = None,
        __event_call__: Optional[Callable[[dict], Awaitable[dict]]] = None,
    ) -> Optional[dict]:
        messages = body.get("messages", [])
        if not messages:
            await self.emit_status(__event_emitter__, "error", "No messages found", True)
            return {"error": "No messages found in request body"}

        body["file_contents"] = body.get("files", [])
        saved_files = []

        if body["file_contents"]:
            last_message_content = messages[-1]["content"]
            if isinstance(last_message_content, str):
                last_message_content = [{"type": "text", "text": last_message_content}]

            for f in body["file_contents"]:
                # Fetch the raw file bytes from the Open WebUI backend.
                url = self.config.OPENWEBUI_URL + f["url"] + "/content"
                print(f"Fetching files from {url}")
                response = requests.get(
                    url,
                    headers={"Authorization": f"Bearer {self.config.OPENWEBUI_API_KEY}"},
                )

                if isinstance(response.content, bytes) and response.status_code == 200:
                    file_name = f["file"]["filename"]
                    content_type = f["file"]["meta"]["content_type"]

                    # Write the fetched bytes into the configured local folder.
                    save_path = os.path.join(self.config.SAVE_FOLDER, file_name)
                    os.makedirs(os.path.dirname(save_path), exist_ok=True)

                    with open(save_path, "wb") as fp:
                        fp.write(response.content)

                    print(f"Saved file to {save_path}")
                    saved_files.append({"filename": file_name, "path": save_path})
                else:
                    print("Failed to fetch file or non-bytes content")

            messages[-1]["content"] = last_message_content

        await self.emit_status(__event_emitter__, "info", "File saved locally", True)
        return {"saved_files": saved_files}

TigerAI-TW avatar Jul 06 '25 21:07 TigerAI-TW

Found a workaround: when a file is uploaded, it is stored in OpenWebUI/uploads. We can pass that information to another tool and execute the follow-up processing there.
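
For illustration, an external script could resolve the newest file in that folder and hand its path to another tool — the folder location is taken from the workaround above; adjust for your deployment:

import os

UPLOADS_DIR = "OpenWebUI/uploads"  # folder named in the workaround above

# Pick the most recently written upload and pass its path to an external tool.
paths = [os.path.join(UPLOADS_DIR, name) for name in os.listdir(UPLOADS_DIR)]
latest = max(paths, key=os.path.getmtime)
print(latest)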

TigerAI-TW avatar Jul 06 '25 22:07 TigerAI-TW

The problem is that Open WebUI still tries to inject the file contents via the RAG prompt; the bypass doesn't disable it completely.

thinkrivan avatar Jul 07 '25 11:07 thinkrivan

+1

JonasWild avatar Jul 14 '25 15:07 JonasWild

I think there should be an option to pass the file name and path to the LLM in context. Tools like python-docx could easily take advantage of this. I had to make a convoluted workaround for my template editor, which uses the API to retrieve the proper file names and build the path to the document. It would have been much easier if this were automatically passed to the LLM when a file was attached.

https://gitlab.com/gmod-central/owui-word-doc-template-editor
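
For instance, if Open WebUI handed the stored path to the model in context, a tool could open the document directly with python-docx instead of rebuilding the path through the API — the path below is hypothetical:

from docx import Document  # pip install python-docx

# Hypothetical path the LLM would receive if it were passed in context.
path = "data/uploads/<file_id>_template.docx"

doc = Document(path)
for paragraph in doc.paragraphs:
    print(paragraph.text)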

GmodCentral avatar Jul 15 '25 18:07 GmodCentral

Thanks guys, I would also love this feature! We use custom APIs as Tool Servers, and some endpoints expect a binary file as a parameter. What is the status of this issue?

kevinxschulz avatar Aug 29 '25 08:08 kevinxschulz

Hey, is there any update on this? I want to disable embeddings just for CSV and Excel files and keep them for other file types. Bypass Embedding and Retrieval prevents the file from being uploaded at all, with the error "401: unsupported file type". Is there any workaround for this?

parthraut45 avatar Oct 24 '25 07:10 parthraut45