extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Hello! This is my first time contributing to a public repository, and it’s also my first time using Rust. I hope you find the changes satisfactory. I found your repository...
Currently the build script rebuilds the GraalVM libs often. Since these are pretty much static, add a way to check whether we already built them in the build directory of...
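One way such a check could look, as a minimal sketch only: the build step writes a stamp file after it succeeds, and later runs skip the rebuild when that file exists. The helper names `needs_rebuild` and `mark_built` and the stamp-file idea are assumptions for illustration, not taken from the project's build script.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns true when the native libs should be (re)built,
/// i.e. when no stamp file from an earlier successful build is present.
fn needs_rebuild(libs_dir: &Path, stamp_name: &str) -> bool {
    !libs_dir.join(stamp_name).is_file()
}

/// Hypothetical helper: records a successful build by writing the stamp file.
/// Creates the directory first so the call also works on a clean checkout.
fn mark_built(libs_dir: &Path, stamp_name: &str) -> io::Result<()> {
    fs::create_dir_all(libs_dir)?;
    fs::write(libs_dir.join(stamp_name), b"ok")
}
```

A build script would call `needs_rebuild` before the expensive step and `mark_built` after it. A stamp file only proves that a build finished once; it does not notice changed inputs, so a real check may also want to compare versions or hashes.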
Add an extract_bytes() function to Extractor in both the Rust core and the Python binding
Debug why extract to stream is slower than extract to string and improve accordingly
Add functionality to detect the file type, through a Detector struct or just a simple function. I'm not sure how this is best implemented: - Preferably this would be implemented...
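For the "simple function" option, a minimal sketch could match a file's leading bytes against a few well-known signatures. The function name `detect_mime` and the short signature table are assumptions for illustration; this is not how the project or Tika performs detection, and real detection covers far more formats.

```rust
/// Hypothetical helper: guess a MIME type from the leading "magic" bytes.
/// Falls back to a generic binary type when no signature matches.
fn detect_mime(bytes: &[u8]) -> &'static str {
    // (signature, MIME type) pairs for a handful of common formats.
    const SIGNATURES: &[(&[u8], &str)] = &[
        (b"%PDF-", "application/pdf"),
        (b"\x89PNG\r\n\x1a\n", "image/png"),
        (b"PK\x03\x04", "application/zip"),
        (b"GIF87a", "image/gif"),
        (b"GIF89a", "image/gif"),
    ];
    SIGNATURES
        .iter()
        .find(|(magic, _)| bytes.starts_with(magic))
        .map(|(_, mime)| *mime)
        .unwrap_or("application/octet-stream")
}
```

Note that container formats share signatures (many office documents are ZIP archives), so a signature match alone is only a first guess.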
Currently, extraction returns only the content. A metadata object describing some details about the file content should be returned with the extraction result. - Create wrapper struct for Tika...
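A rough sketch of what a result carrying metadata might look like. The type names `Metadata` and `ExtractionResult`, their fields, and their methods are hypothetical and not the project's API. Keys map to a list of values here because Tika metadata fields can hold more than one value.

```rust
use std::collections::HashMap;

/// Hypothetical metadata container: each key may hold several values.
#[derive(Debug, Default)]
struct Metadata {
    entries: HashMap<String, Vec<String>>,
}

impl Metadata {
    /// Appends a value under `key`, keeping any earlier values.
    fn add(&mut self, key: &str, value: &str) {
        self.entries
            .entry(key.to_string())
            .or_default()
            .push(value.to_string());
    }

    /// Returns the first value stored under `key`, if any.
    fn get(&self, key: &str) -> Option<&str> {
        self.entries
            .get(key)
            .and_then(|values| values.first())
            .map(String::as_str)
    }

    /// Returns all values stored under `key` (empty when the key is absent).
    fn get_all(&self, key: &str) -> &[String] {
        self.entries.get(key).map(Vec::as_slice).unwrap_or(&[])
    }
}

/// Hypothetical extraction result: the text plus its metadata.
#[derive(Debug)]
struct ExtractionResult {
    content: String,
    metadata: Metadata,
}
```

Returning one struct rather than a tuple leaves room to add fields later without changing every call site.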
To get Windows support: - We need to make sure the Rust core works on Windows - Update the Python binding GitHub workflow to build the PyPI package for Windows