Make EasyOCR optional dependency

Open jaluma opened this issue 1 year ago • 0 comments

Requested Feature

It would be beneficial to make OCR models optional during installation, with EasyOCR remaining as the default option. In our case, we use TesseractOCR but are required to install EasyOCR since it's currently mandatory, even though we don't use it.

Here's a proposed installation approach:

  1. All OCR models: pip install docling[all]

  2. EasyOCR only (default installation): pip install docling[easyocr]

  3. Specific OCR models: pip install docling[tesseract]

  4. Base installation (no OCR models): pip install docling
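
With extras like these, code that needs an OCR backend would have to check at runtime whether that backend is actually installed. The sketch below shows one possible way to do that using only the standard library. It is not how Docling implements this: the `OCR_BACKENDS` mapping, the function names, and the module names (`easyocr`, `tesserocr`) are assumptions for illustration.

```python
from importlib.util import find_spec

# Hypothetical mapping: extra name -> top-level module that extra would install.
# These names are illustrative assumptions, not Docling's actual configuration.
OCR_BACKENDS = {
    "easyocr": "easyocr",
    "tesseract": "tesserocr",
}


def available_ocr_backends(backends=OCR_BACKENDS):
    """Return the names of the backends whose module is importable."""
    return [
        name
        for name, module in backends.items()
        if find_spec(module) is not None  # None means the module is not installed
    ]


def require_backend(name, backends=OCR_BACKENDS):
    """Raise a helpful error if the requested backend is unknown or missing."""
    if name not in backends:
        raise ValueError(f"Unknown OCR backend: {name!r}")
    if find_spec(backends[name]) is None:
        raise ImportError(
            f"OCR backend {name!r} is not installed. "
            f"Try: pip install 'docling[{name}]'"
        )
    return name
```

A caller could run `require_backend("tesseract")` before building its pipeline, so that a missing extra fails early with an install hint instead of a bare `ModuleNotFoundError` deep in the conversion code.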

Alternatives

  1. Install Docling as is - This installs EasyOCR and its dependencies even when they're not needed.
  2. Install Docling without dependencies - This requires significant maintenance effort on our end to ensure version compatibility.

jaluma • Dec 23 '24 12:12