Nadjib

Results 9 issues of Nadjib

Currently the build script rebuilds the GraalVM libs often. Since these are pretty much static, add a way to check whether we have already built them in the build directory of...

enhancement
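
One possible shape for such a check, as a minimal sketch only: it assumes the build output is a directory containing a known set of library files. The function name `libs_already_built`, the directory layout, and the file names are all hypothetical and are not taken from the project's actual build script.

```rust
use std::path::Path;

/// Returns true only when every expected library file already exists in `dir`.
/// Hypothetical helper: the real build script may track its outputs differently.
fn libs_already_built(dir: &Path, expected: &[&str]) -> bool {
    // `is_file` is false for missing paths, so a missing directory also yields false.
    expected.iter().all(|name| dir.join(name).is_file())
}
```

A build script could call this before starting the expensive step and skip the rebuild when it returns true. Checking only for file existence will not notice stale outputs, so a real implementation might also compare a version marker or timestamps.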

Add an extract_bytes() function to Extractor in both the Rust core and the Python bindings

enhancement
good first issue

Debug why extracting to a stream is slower than extracting to a string, and improve it accordingly

enhancement

Add functionality to detect the file type, either through a Detector struct or just a simple function. I'm not sure how this is best implemented: - Preferably this would be implemented...

enhancement

Currently, extraction returns only the content. A metadata object describing some details about the file content should be returned along with the extraction result. - Create wrapper struct for Tika...

enhancement

To get Windows support: - We need to make sure the Rust core works on Windows - Update the Python binding GitHub workflow to build the PyPI package for Windows

enhancement
good first issue

- The process stalls until killed when running on macOS with OCR enabled on PDF documents that have embedded images. - OCR works fine with direct images but the...

bug