datawave
Implement SSDeep-based document retrieval
In #2085 we implemented the ability to retrieve similar SSDeep hashes for a query hash using an ngram index.
For this issue, extend that capability so that we can retrieve similar documents for an SSDeep hash query. This will involve two retrieval passes: the first finds SSDeep hashes similar to the query hash, and the second finds the documents / metadata that contain those hashes. The response should look like a regular document query response, returning a list of the fields for each retrieved document.
Conceptually, this could be similar to the /lookupContentUUID API call.