History - Group by Secret

Open yrachelevi opened this issue 1 year ago • 1 comments

Today, when a secret is found in several versions of a document's history, the scan returns a separate result for each version (all under the same ID). We would like to group these results and add data about the versions where the secret was found.

The added data is:

  • count of the occurrences in the history (how many versions are included)
  • First and last version (version identifier and date)
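
A minimal sketch of the grouping described above, assuming each raw result carries a secret ID, a version identifier, and a version date. The names `Finding`, `GroupedFinding`, and `group_by_secret` are hypothetical and are not taken from the 2ms codebase.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Finding:
    """One raw result: a secret seen in one version (hypothetical shape)."""
    secret_id: str
    version: str
    version_date: datetime


@dataclass
class GroupedFinding:
    """All versions of one secret collapsed into a single result."""
    secret_id: str
    count: int                      # number of distinct versions containing the secret
    first: Tuple[str, datetime]     # (version identifier, date) of the earliest version
    last: Tuple[str, datetime]      # (version identifier, date) of the latest version


def group_by_secret(findings: Iterable[Finding]) -> Dict[str, GroupedFinding]:
    # Collect the distinct versions per secret; a repeated hit in the
    # same version is counted once.
    versions_by_secret: Dict[str, Dict[str, datetime]] = {}
    for f in findings:
        versions_by_secret.setdefault(f.secret_id, {})[f.version] = f.version_date

    grouped: Dict[str, GroupedFinding] = {}
    for secret_id, versions in versions_by_secret.items():
        # Order versions by date to find the first and last occurrence.
        ordered = sorted(versions.items(), key=lambda item: item[1])
        grouped[secret_id] = GroupedFinding(
            secret_id=secret_id,
            count=len(ordered),
            first=ordered[0],
            last=ordered[-1],
        )
    return grouped
```

Ordering by date is an assumption here; a source that exposes an explicit version sequence could sort on that instead.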

yrachelevi avatar Nov 01 '23 23:11 yrachelevi

Technical Details

Working with versions in general

Today we extract the content from the source and pass each piece of content on to the detector engine independently. To take history into account, which means that multiple contents are related, I can see two options:

  1. Change our approach and treat a document, together with all its history versions, as one connected block to scan.
  2. Post-process the results, where different versions are already combined under the same result ID, and extract the version info from there.

I think we should choose the first option. From an engineering perspective, it lets us declare and control the versions (and their order) ourselves, rather than sending them to the other side of the process and treating what comes back as the source of truth.
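
As a rough illustration of the first option, the sketch below models a document and its history as one block whose version order is declared by the plugin, so every hit can be tagged with the version it came from. `Version`, `VersionedDocument`, `scan_document`, and the `detect` callback are hypothetical names, not the actual 2ms types.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Version:
    identifier: str
    content: str


@dataclass
class VersionedDocument:
    """A document plus its history, handed to the scanner as one block."""
    document_id: str
    # Assumption: the plugin fills this list oldest-first, so the order
    # is declared by the producer instead of inferred from results.
    versions: List[Version] = field(default_factory=list)


def scan_document(
    doc: VersionedDocument,
    detect: Callable[[str], List[str]],
) -> List[Tuple[str, str]]:
    """Return (secret, version identifier) pairs in the declared order."""
    hits: List[Tuple[str, str]] = []
    for version in doc.versions:
        for secret in detect(version.content):
            hits.append((secret, version.identifier))
    return hits
```

Because the scanner walks `versions` itself, the first and last version for each secret can be read directly from the order of the returned pairs.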

The --history flag

All of the discussion above applies when --history is enabled. What should we do when --history is omitted?

Plugins

Assuming we now work with a batch of versions of the same document, this will be challenging for some plugins.

Git

With Git, we do not read and scan each whole version; we take only the diff. How should we treat a version that deletes a line or a file? How will the Detector know the secret was removed?
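
To make the question concrete, here is a small sketch that separates the added and removed lines of a unified diff, so removals are at least visible to whatever consumes the diff. It does not decide how the Detector should treat them. The function name `split_diff_lines` is hypothetical, and the parsing assumes standard `git diff` output with `diff --git` and `@@` markers.

```python
from typing import List, Tuple


def split_diff_lines(diff_text: str) -> Tuple[List[str], List[str]]:
    """Return (added_lines, removed_lines) from a unified diff."""
    added: List[str] = []
    removed: List[str] = []
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            # A new file section starts; its header lines are not content.
            in_hunk = False
        elif line.startswith("@@"):
            # Hunk header: content lines follow until the next file section.
            in_hunk = True
        elif in_hunk and line.startswith("+"):
            added.append(line[1:])
        elif in_hunk and line.startswith("-"):
            removed.append(line[1:])
        # Context lines (leading space) and other markers are ignored.
    return added, removed
```

A secret that appears only in the removed lines of a commit could then mark that commit as the point where it left the file, although whether the Detector should receive removed lines at all is still the open question.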

Action Items

  • [ ] I will start with Confluence, changing the process to scan a batch of versions.

baruchiro avatar Feb 29 '24 16:02 baruchiro