bulk_extractor
create Python scanner
Create a scanner that runs multiple Python sub-interpreters and exposes an API that lets scripts analyze data and call the feature recorder.
Possible use cases:
- Write a Python script that scans for configuration files for a specific program. It would of course be faster in C++, but it should be trivial to hook up to bulk_extractor through a Python API, and it would then be useful for scanning both executables and memory images. Many people can't hack the C++.
- Send encoded data off to Apache Tika for text extraction, and then scan the extracted text as sbufs in their own right. This would be especially useful if you had somewhat larger block sizes and could recognize MS Office files and PDFs.
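
To make the first use case concrete, here is a minimal sketch of what a Python-side scanner might look like. Nothing here is an existing bulk_extractor API: the `FeatureRecorder` class, its `write` method, the `scan(sbuf, recorder)` entry point, and the INI-style pattern are all hypothetical stand-ins, chosen only to show the shape such an interface could take.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FeatureRecorder:
    """Hypothetical stand-in for the feature recorder a Python API might expose."""
    name: str
    features: list = field(default_factory=list)

    def write(self, offset: int, feature: bytes, context: bytes = b"") -> None:
        # Store (offset, feature, context) tuples in the order they are reported.
        self.features.append((offset, feature, context))


# Assumed signature of a config file: an INI-style "[section]" header
# immediately followed by one "key=value" line.
CONFIG_RE = re.compile(rb"\[([A-Za-z0-9_]+)\]\r?\n([A-Za-z0-9_]+)=([^\r\n]*)")


def scan(sbuf: bytes, recorder: FeatureRecorder) -> None:
    """Hypothetical entry point: called once per buffer with raw bytes."""
    for match in CONFIG_RE.finditer(sbuf):
        # Report up to 8 bytes of surrounding data on each side as context.
        start = max(0, match.start() - 8)
        end = min(len(sbuf), match.end() + 8)
        recorder.write(match.start(), match.group(0), sbuf[start:end])


if __name__ == "__main__":
    data = b"\x00\x01junk[myapp]\nuser=alice\n\xff"
    recorder = FeatureRecorder("config")
    scan(data, recorder)
    for offset, feature, context in recorder.features:
        print(offset, feature, context)
```

In this sketch the script author only writes `scan`; the C++ side would be responsible for carving buffers, calling into the interpreter, and persisting whatever the recorder collects.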