Nadjib
Currently the build script rebuilds the GraalVM libs often. Since these are pretty much static, add a way to check whether we have already built them in the build directory of...
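One possible shape for that check, as a rough sketch only: before invoking the expensive build, test whether every expected artifact is already present in the output directory. The function name `libs_already_built`, the directory argument, and the artifact names are hypothetical placeholders, not taken from the actual build script.

```rust
use std::path::Path;

/// Returns true when every expected artifact already exists as a regular
/// file under `libs_dir`. Both `libs_dir` and the names in `expected` are
/// hypothetical; the real build script would supply its own values.
pub fn libs_already_built(libs_dir: &Path, expected: &[&str]) -> bool {
    // An empty expectation list is treated as "not built" so that a
    // misconfigured caller never skips the build by accident.
    !expected.is_empty()
        && expected
            .iter()
            .all(|name| libs_dir.join(name).is_file())
}
```

A build script could call this first and skip the GraalVM step when it returns true. This only checks for presence; it does not detect stale or partially written files, so a version stamp or checksum might still be needed.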
Add an extract_bytes() function to Extractor in both the Rust core and the Python bindings
Debug why extract to stream is slower than extract to string and improve accordingly
Add functionality to detect the file type, through a Detector struct or just a simple function. I'm not sure how this is best implemented: - Preferably this would be implemented...
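For the "simple function" option, a minimal sketch could look like the following. It is an assumption-heavy illustration: it uses plain magic-byte matching rather than whatever detection the issue ultimately prefers, the name `detect_file_type` is hypothetical, and only three well-known signatures are covered.

```rust
/// Guesses a MIME type from the leading bytes of a file.
/// Hypothetical helper: this is plain signature matching, not the
/// project's detector, and it only knows three formats.
pub fn detect_file_type(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(b"%PDF-") {
        // PDF files begin with the "%PDF-" header.
        Some("application/pdf")
    } else if bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
        // The fixed 8-byte PNG signature.
        Some("image/png")
    } else if bytes.starts_with(b"PK\x03\x04") {
        // ZIP local file header; also matches ZIP-based containers
        // such as DOCX or XLSX, which this sketch cannot tell apart.
        Some("application/zip")
    } else {
        None
    }
}
```

A Detector struct would mainly add value if detection needs configuration or state; for a fixed signature table like this one, a free function is probably enough.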
Currently, extraction returns only the content. A Metadata object should be returned with the extraction result, describing some details about the file content. - Create wrapper struct for Tika...
To get Windows support: - We need to make sure the Rust core works on Windows - Update the Python binding GitHub workflow to build the PyPI package for Windows
- The process stalls until killed when running on macOS with OCR enabled on PDF documents that have embedded images. - OCR works fine with direct images but the...