
Re-importing the same report leaves the duplicates in status mitigated

Open ptrovatelli opened this issue 3 years ago • 6 comments

Bug description When we import a report that contains duplicates within itself, re-importing the same report causes those duplicates to be marked as mitigated.

  • first import:
    • 1 active, verified
    • 1 inactive, duplicate
  • after reimport:
    • 1 active, verified
    • 1 inactive, mitigated, duplicate

It doesn't seem correct to me. I'd expect to have:

  • 1 active, verified
  • 1 inactive, duplicate (same status as before the reimport, since we have just re-imported the same report)

The problem is that when matching the new findings to the existing findings, we always match each new finding to the same single original finding (the one that's not a duplicate). Consequently, the duplicates from the original report are flagged as mitigated, because no new finding was matched against them.

I think the issue is here (serializers.py and test/views.py):

                if findings:
                    # existing finding found
                    finding = findings[0]

Instead of working on only the first finding that matches, we should work on all of them (might need a bit of tuning in order not to re-save the same findings multiple times...).
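To illustrate the idea, here is a minimal Python sketch of one-to-one matching during reimport. `Finding` and `match_reimport` are hypothetical stand-ins for this comment, not DefectDojo's actual models or import code:

```python
# Hypothetical sketch: pair each incoming finding with a DISTINCT existing
# finding sharing its hash_code, instead of always reusing findings[0].
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    hash_code: str


def match_reimport(existing, incoming):
    # Group existing findings by hash_code so duplicates form a pool.
    pool = defaultdict(list)
    for f in existing:
        pool[f.hash_code].append(f)

    matched, unmatched = [], []
    for new in incoming:
        candidates = pool[new.hash_code]
        if candidates:
            # Consume the candidate so the next duplicate matches the next one.
            matched.append(candidates.pop(0))
        else:
            unmatched.append(new)

    # Anything left in the pool was not seen in the new report.
    leftover = [f for lst in pool.values() for f in lst]
    return matched, unmatched, leftover
```

With this scheme, re-importing an identical report leaves `leftover` empty, so no duplicate would be wrongly mitigated.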

Sample data: a report for the Checkmarx parser is attached.

Steps to reproduce the behavior:

  • Import the report once
  • Re-import the same report

Expected behavior The status after the re-import is the same as after the initial import.

Deployment method (select with an X)

  • [ ] Kubernetes
  • [x] Docker
  • [ ] setup.bash / legacy-setup.bash

Environment information

  • DefectDojo Commit Message:
$ git show -s --format="[%ci] %h: %s [%d]"
[2021-02-26 17:14:50 +0100] 4f789bc7: fix flake8, fix Q queries, add  unit tests [ (HEAD -> reimport-configurable-dedupe, myOrigin/reimport-configurable-dedupe)]

Sample scan files (optional) See attached checkmarx_duplicate_in_same_report.zip

Screenshots (optional)

Console logs (optional)

Additional context (optional) I've found this while working on https://github.com/DefectDojo/django-DefectDojo/pull/3753 and the provided report will produce the issue after this is merged. The problem was present before PR 3753 but the test data provided might not replicate it.

ptrovatelli avatar Feb 28 '21 21:02 ptrovatelli

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 11 '21 01:06 stale[bot]

It looks like this was fixed via https://github.com/DefectDojo/django-DefectDojo/pull/3753

devGregA avatar Jul 02 '21 20:07 devGregA

I was under the impression this was fixed, but it may not be. @valentijnscholten could you please confirm?

devGregA avatar Jul 02 '21 21:07 devGregA

This still happens when an import contains duplicates, i.e. 3 findings that get the same hash_code. On the next reimport of that report, there is no way to uniquely match the 3 incoming findings against the 3 existing ones. So all 3 incoming findings get matched against the first existing finding with that hash_code. That finding will remain open. The other 2 will be marked as mitigated, as it appears they are no longer present in the report.
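A toy reproduction of that matching behavior (illustrative code only, not the actual reimport logic):

```python
def buggy_match(existing_hashes, incoming_hashes):
    """Mimic matching every incoming finding to the FIRST existing finding
    with the same hash_code; return the indices of existing findings that
    would then be flagged as mitigated."""
    touched = set()
    for h in incoming_hashes:
        idxs = [i for i, eh in enumerate(existing_hashes) if eh == h]
        if idxs:
            touched.add(idxs[0])  # always the first existing match
    return [i for i in range(len(existing_hashes)) if i not in touched]


# Re-importing a report whose 3 findings share one hash_code:
print(buggy_match(["h1", "h1", "h1"], ["h1", "h1", "h1"]))  # -> [1, 2]
```

Two of the three existing findings are never matched, so they look "no longer present" and get mitigated.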

Usually this means the deduplication algorithm is not 100% correct for the scanner, or the parser doesn't do a good job with dupe_key deduplication. Still, I think we should try to improve the situation, as it is very confusing currently. In the past we had matching on title, cve, cwe, etc. Maybe we should fall back to those when there are multiple existing findings with the same hash_code.
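That fallback could look roughly like this; `pick_match` and the plain dicts are illustrative assumptions for the sake of discussion, not DefectDojo's API:

```python
def pick_match(candidates, new):
    """Among existing findings sharing the new finding's hash_code, prefer
    one that also matches on title/cve/cwe; otherwise fall back to the
    first candidate, as happens today."""
    if len(candidates) <= 1:
        return candidates[0] if candidates else None
    key = (new.get("title"), new.get("cve"), new.get("cwe"))
    refined = [f for f in candidates
               if (f.get("title"), f.get("cve"), f.get("cwe")) == key]
    return refined[0] if refined else candidates[0]
```

This only disambiguates when the extra fields actually differ between the duplicates; identical duplicates would still need the one-to-one consumption approach discussed earlier in the thread.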

I think it's too complicated for a 'good first issue', as one needs to fully understand all the details of import/reimport/hash_code. Hope you don't mind me removing that label again.

valentijnscholten avatar Jul 03 '21 07:07 valentijnscholten

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 16 '22 10:04 stale[bot]

I don't think it was fixed.

ptrovatelli avatar Apr 29 '22 08:04 ptrovatelli

Any updates on this issue? Still annoying in 2024

bmihaescu avatar Jan 17 '24 17:01 bmihaescu