"source" field in normalized JSON?

Open darrenmartyn opened this issue 4 years ago • 5 comments

Would it be feasible to add a "source" field to the JSON/indexed data, so you could "tag" entries as being from certain leaks?

This could be very useful when trying to go back later and attribute where a piece of data came from, but I'm unsure whether it would have performance impacts.

darrenmartyn avatar Jul 08 '20 10:07 darrenmartyn

I don't think it would have much of an impact on performance. Most of the code operates on lines, not the actual content of each line, so there's little code that would need to change. A few other folks have been asking for something like this, so I'll probably look at adding it. It would affect the bloom filter's ability to de-duplicate identical user/password combos, since they'd be from different sources, so there could be a modest impact on index/sort times, but I don't think there'd be a large impact on search times.
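As a rough illustration of that de-duplication effect (a sketch only, not LeakDB's actual code): the `Record` type, the key functions, and the leak names below are all hypothetical, and a plain map of hashes stands in for the bloom filter. Keying on the whole line, source included, keeps the same credential pair once per leak instead of once overall.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Record is a hypothetical normalized entry; it is not LeakDB's real schema.
type Record struct {
	Email, Password, Source string
}

// credentialKey hashes only the credential fields, so the same
// email/password pair from two different leaks maps to the same key.
func credentialKey(r Record) [32]byte {
	return sha256.Sum256([]byte(r.Email + "\x00" + r.Password))
}

// lineKey also hashes the source, so the same credentials from two
// different leaks map to different keys.
func lineKey(r Record) [32]byte {
	return sha256.Sum256([]byte(r.Email + "\x00" + r.Password + "\x00" + r.Source))
}

// countUnique counts distinct keys. An exact set (a map) is used here
// in place of a bloom filter to keep the sketch short.
func countUnique(records []Record, key func(Record) [32]byte) int {
	seen := make(map[[32]byte]struct{})
	for _, r := range records {
		seen[key(r)] = struct{}{}
	}
	return len(seen)
}

func main() {
	records := []Record{
		{"user@example.com", "hunter2", "leak-a"},
		{"user@example.com", "hunter2", "leak-b"}, // same credentials, other leak
		{"user@example.com", "hunter2", "leak-a"}, // exact duplicate of the first
	}
	fmt.Println("unique by credentials:", countUnique(records, credentialKey))
	fmt.Println("unique by full line:  ", countUnique(records, lineKey))
}
```

With the three sample records, keying on credentials alone leaves one entry, while keying on the full line leaves two, which is where the extra index/sort work would come from.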

moloch-- avatar Jul 08 '20 12:07 moloch--

Any news on this feature request?

aaronkaplan avatar Feb 15 '21 22:02 aaronkaplan

Not had time to work on it yet, sorry!

moloch-- avatar Feb 16 '21 00:02 moloch--

No worries, just wanted to figure out what the status is. What would be needed? I.e., is it simple enough for a non-Go coder to add?

Best, a.

aaronkaplan avatar Feb 16 '21 00:02 aaronkaplan

Maybe. Most of the code only cares about "lines" in a file. You'd have to extend the normalizer to add a "source" field to the JSON format, and extend the few parts of the code that parse the JSON to optionally deal with the extra field.
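A minimal sketch of what "optionally deal with the extra field" could look like with Go's standard `encoding/json`. The `Entry` struct, its field names, and `parseEntry` are assumptions made for illustration and are not taken from the LeakDB source. The `omitempty` tag means lines without a source round-trip unchanged, and older lines that lack the field still parse.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Entry is a hypothetical normalized record. The field names are
// assumptions for this sketch, not LeakDB's confirmed JSON format.
type Entry struct {
	Email    string `json:"email"`
	User     string `json:"user"`
	Domain   string `json:"domain"`
	Password string `json:"password"`
	// Source is optional: omitted on output when empty, and left empty
	// on input when the line has no "source" key.
	Source string `json:"source,omitempty"`
}

// parseEntry decodes one JSON line into an Entry.
func parseEntry(line []byte) (Entry, error) {
	var entry Entry
	if err := json.Unmarshal(line, &entry); err != nil {
		return Entry{}, err
	}
	return entry, nil
}

func main() {
	lines := []string{
		// Line without the new field.
		`{"email":"user@example.com","user":"user","domain":"example.com","password":"hunter2"}`,
		// Line tagged with a made-up source name.
		`{"email":"user@example.com","user":"user","domain":"example.com","password":"hunter2","source":"example-leak"}`,
	}
	for _, line := range lines {
		entry, err := parseEntry([]byte(line))
		if err != nil {
			fmt.Println("skipping malformed line:", err)
			continue
		}
		out, _ := json.Marshal(entry)
		fmt.Printf("source=%q json=%s\n", entry.Source, out)
	}
}
```

Code that only treats each line as an opaque string would not need to change; only the parts that decode the JSON would see the new field.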

moloch-- avatar Feb 16 '21 00:02 moloch--