
WIP: docling integration

Open MichaelKarpe opened this issue 3 weeks ago • 2 comments

Changelog Entry

Description

These are minimal, experimental changes to test docling; they are not ready to be released. The goal is to show contributors who have time for further testing and implementation what a docling integration could look like. I will not have enough time in the coming days or weeks to take this further, so please feel free to build upon it.

  • Approach 1: use langchain_docling, as in the first commit of this PR. I got it working, but it ran on the CPU rather than the GPU, leading to unreasonable processing times (a few minutes) for PDFs with hundreds of pages. Proposed improvements:

    • find out how to use the GPU when one is available
    • warn about processing times for long documents when only a CPU is available, and/or fall back to the default engine
    • handle fully offline mode by integrating the models into the library, see https://github.com/DS4SD/docling/issues/326
  • Approach 2: use a Docker image. The Docling team has recently been active on docling-serve (see https://github.com/DS4SD/docling-serve/pkgs/container/docling-serve), so we can probably expect a Docling Docker image soon, comparable to the one for Apache Tika. The integration could then be done by searching the Open WebUI repo for Tika usage and replicating that logic for Docling. CPU/GPU considerations need attention here as well.
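
To make Approach 2 more concrete, here is a rough, untested-against-a-real-server sketch of what a Tika-style HTTP client for docling-serve might look like. It only uses the Python standard library. The route `/v1alpha/convert/source`, the `file_sources` / `options` payload fields, and the `document.md_content` response field are assumptions about docling-serve and should be checked against the version actually deployed. The function names `build_docling_request` and `extract_text` are hypothetical and do not exist in Open WebUI.

```python
import base64
import json
import urllib.request

# Assumption: conversion route exposed by docling-serve; verify for your version.
CONVERT_PATH = "/v1alpha/convert/source"


def build_docling_request(
    base_url: str, filename: str, data: bytes
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying one file."""
    payload = {
        # Assumption: files are passed base64-encoded under "file_sources".
        "file_sources": [
            {
                "filename": filename,
                "base64_string": base64.b64encode(data).decode("ascii"),
            }
        ],
        # Assumption: Markdown output is requested with to_formats=["md"].
        "options": {"to_formats": ["md"]},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + CONVERT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def extract_text(
    base_url: str, filename: str, data: bytes, timeout: float = 300.0
) -> str:
    """Send the file to the server and return the converted Markdown."""
    request = build_docling_request(base_url, filename, data)
    # A generous timeout, since CPU-only conversion can take minutes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the Markdown is returned under document.md_content.
    return body["document"]["md_content"]
```

Keeping request construction separate from the network call makes the payload easy to inspect and unit test without a running container.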

Both approaches could be implemented so that users can choose between them depending on their needs.
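
The CPU/GPU concern and the "warn and/or fall back" idea could be sketched roughly as below. This is only an illustration: the engine labels ("docling", "default"), the `CPU_PAGE_LIMIT` threshold, and the functions `detect_device` and `choose_engine` are all hypothetical and not part of Open WebUI or docling. Device detection assumes PyTorch is what reports GPU availability and degrades to "cpu" when PyTorch is not installed.

```python
import importlib.util
import logging

log = logging.getLogger(__name__)

# Hypothetical threshold: above this many pages, CPU-only docling is assumed too slow.
CPU_PAGE_LIMIT = 50


def detect_device() -> str:
    """Return "cuda", "mps", or "cpu", based on PyTorch if it is installed."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def choose_engine(
    requested: str,
    page_count: int,
    device: str,
    cpu_page_limit: int = CPU_PAGE_LIMIT,
) -> str:
    """Pick the extraction engine, falling back to the default on slow CPU-only runs."""
    if requested != "docling":
        # Any other engine the user selected is passed through unchanged.
        return requested
    if device == "cpu" and page_count > cpu_page_limit:
        log.warning(
            "docling on CPU may take minutes for %d pages; using the default engine",
            page_count,
        )
        return "default"
    return "docling"
```

For example, `choose_engine("docling", 300, detect_device())` would keep docling on a GPU machine but switch to the default engine on a CPU-only one.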

Added

  • [List any new features, functionalities, or additions]

Changed

  • [List any changes, updates, refactorings, or optimizations]

Deprecated

  • [List any deprecated functionality or features that have been removed]

Removed

  • [List any removed features, files, or functionalities]

Fixed

  • [List any fixes, corrections, or bug fixes]

Security

  • [List any new or updated security-related changes, including vulnerability fixes]

Breaking Changes

  • BREAKING CHANGE: [List any breaking changes affecting compatibility or functionality]

Additional Information

  • [Insert any additional context, notes, or explanations for the changes]
    • [Reference any related issues, commits, or other relevant information]

Screenshots or Videos

  • [Attach any relevant screenshots or videos demonstrating the changes]

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.

Before submitting, make sure you've checked the following:

  • [ ] Target branch: Please verify that the pull request targets the dev branch.
  • [ ] Description: Provide a concise description of the changes made in this pull request.
  • [ ] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [ ] Documentation: Have you updated the relevant documentation (Open WebUI Docs or other documentation sources)?
  • [ ] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • [ ] Testing: Have you written and run sufficient tests for validating the changes?
  • [ ] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • [ ] Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

MichaelKarpe, Feb 02 '25 13:02