Ma Mana ma Manama

206 comments by Ma Mana ma Manama

Fixed it:
```
mkdir -p downloads/
cd downloads
git clone https://github.com/rhasspy/pipe
mkdir build
cd build
```
then e.g. my fancy: `rm CMakeCache.txt & echo "Starting compilation 👩‍💻 proper..." && mkdir...

Update: I have also solved the other problems, see: https://github.com/rhasspy/piper/issues/814#issuecomment-3046847047

Update: oh, I see it may not be that easy. The source PDF is weird: ``` pdfinfo ~/Downloads/szewcy2_3.1_pl.pdf Creator: PDFium Producer: PDFium CreationDate: Sat Nov 2 20:11:38 2024 CET Custom...

Ok, it will not be that easy, methinks, as these CIDs are to blame: ``` ~/Downloads/pdfcpu_0.9.1_Linux_x86_64$ docling-parse -p ~/Downloads/szewcy2_3.1_pl.pdf | head [02-11-2024 10:13:26 WARNING] /project/src/v1/proj_folders/pdf_library/qpdf/parser/cid_cmap.h:168 could not find file for...

I have quickly coded your advice into Python, below, but got the same results - the Polish engine does not engage: ``` import sys from pathlib import Path from docling.document_converter import DocumentConverter,...

On mobile (Termux) here - I will be "testing aloud", writing as I go. I am doing this: 1. Update docling: ` Collecting docling Downloading docling-2.8.0-py3-none-any.whl.metadata (7.2 kB), Collecting docling-parse=2.0.5...

3. continued: We have: ``` Environment at local 🎋 prooted system: Linux localhost 6.2.1-PRoot-Distro #1 SMP PREEMPT Thu Mar 17 16:28:22 CST 2022 aarch64 GNU/Linux ``` and working: ``` Docling...

4. I have switched to a regular Ubuntu (and PC) to speed things up, as it had the model downloaded. The same docling versions as in Point 3, so we...

I may have found the culprit - the current code (mis)uses EasyOCR as the engine when the input is a JPG, see https://github.com/DS4SD/docling/issues/505 . Ugly workaround: print the JPG as a PDF and then try again (works).
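
For anyone who wants to script that JPG-to-PDF detour, here is a minimal sketch in Python. It is not what I actually ran: it assumes the `img2pdf` command-line tool is installed for the wrapping step, and that the installed docling CLI accepts an `--ocr-lang` option. The function names are made up for illustration. By default it only builds the commands, so you can inspect them before running anything.

```python
import shutil
import subprocess
from pathlib import Path


def jpg_to_pdf_command(jpg: Path) -> list[str]:
    """Command that wraps the JPG in a PDF (assumes the img2pdf CLI is installed)."""
    return ["img2pdf", str(jpg), "-o", str(jpg.with_suffix(".pdf"))]


def docling_command(pdf: Path, lang: str = "pl") -> list[str]:
    """Command that converts the PDF (assumes this docling version has --ocr-lang)."""
    return ["docling", str(pdf), "--ocr-lang", lang]


def convert_via_pdf(jpg: Path, lang: str = "pl", dry_run: bool = True) -> list[list[str]]:
    """Build both commands; execute them in order only when dry_run is False."""
    pdf = jpg.with_suffix(".pdf")
    commands = [jpg_to_pdf_command(jpg), docling_command(pdf, lang)]
    if not dry_run:
        for cmd in commands:
            # Fail early with a clear message if a tool is missing from PATH.
            if shutil.which(cmd[0]) is None:
                raise FileNotFoundError(f"{cmd[0]} not found on PATH")
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_via_pdf(Path("scan.jpg"))` returns the two command lists without touching the file system; pass `dry_run=False` to actually run them.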

Over many a morning coffee, I have written an extensive hack to overcome it in that version, here: https://github.com/Manamama/Ubuntu_Scripts_1/blob/main/docling_me.sh