tijoseymathew

Results: 4 comments by tijoseymathew

I tested on two machines with the following configuration and hit a segmentation fault:

```{shell}
$ lscpu | grep Architecture
Architecture: x86_64
$ lsb_release -a
No LSB modules are available.
Distributor...
```

@brownsloth I had previously managed to map sentence-level provenance in https://github.com/google/langextract/issues/184#issue-3349438594 (see `pdf_extract.py`).

**Update:** I've published [`langextract-docling`](https://github.com/tijoseymathew/langextract-docling) — a wrapper that adds **PDF support** to LangExtract. It works as a drop-in replacement for `lx.extract(...)`, with support for PDF files (local or URLs). ```python...

Hi @aksg87, I wasn’t sure if there was enough interest earlier, so I didn’t expand the library further. Based on this thread, I’ve created a new branch `feat/provenance` that...