nl-kat-coordination

Add chrome with mitmproxy crawler and report

Open floort opened this issue 1 year ago • 5 comments

floort avatar Oct 09 '23 20:10 floort

Checklist for QA:

  • [x] I have checked out this branch, and successfully ran a fresh make reset.
  • [x] I confirmed that there are no unintended functional regressions in this branch:
    • [x] I have managed to pass the onboarding flow
    • [x] Objects and Findings are created properly
    • [x] Tasks are created and completed properly
  • [ ] I confirmed that the PR's advertised feature or hotfix works as intended.

I just want to say I think it's awesome to see that the work we did for the containerized boefjes and the new report generation is paying off! :tada:

What works:

  • Boefje and normalizer output looks good
  • Report generation works as intended
  • Tested on a bunch of ugly websites, looks like the retrieved information is correct (plus no crashes)

What doesn't work:

  • Image does not exist yet and requires running docker build -t lapje boefjes/images/lapje. I think we should modify and generalize masscan's workflow before we merge this (shouldn't be too much work).

  • Minor overflow bug on the reports page when dealing with long cookie values (not a bug introduced by this PR though; we should fix this in a separate PR).

Bug or feature?:

  • Chrome is currently not pinned. To prevent unexpected regressions like the ones we've seen in the past, we probably should. I don't think this is possible through Google's repository though. Why don't we use Debian's chromium package?

  • From the report page, it is not entirely clear why some records have no values/content, and only a hostname. Some more info (maybe in Lapje's katalogus detail page) about what Lapje does exactly, and how its objects should be interpreted, would be nice.

Darwinkel avatar Oct 19 '23 08:10 Darwinkel

I renamed the boefje and normalizer.

Image does not exist yet and requires running docker build -t lapje boefjes/images/lapje. I think we should modify and generalize masscan's workflow before we merge this (shouldn't be too much work).

I added the GitHub workflow, but it currently fails because PRs from external contributors can't push to the container registry. This should be fixed once the PR is merged.

Minor overflow bug on the reports page when dealing with long cookie values: (not a bug introduced by this PR though, we should fix this in a separate PR)

This has been fixed.

Chrome is currently not pinned. To prevent unexpected regressions like the ones we've seen in the past, we probably should. I don't think this is possible through Google's repository though. Why don't we use Debian's chromium package?

Chrome is used because we can't be sure that Debian's chromium package has no modifications that affect the results, or won't get them in the future. I think getting accurate results using the browser that most people use is more important than the risk of breakage from newer versions.

From the report page, it is not entirely clear why some records have no values/content, and only a hostname. Some more info (maybe in Lapje's katalogus detail page) about what Lapje does exactly, and how its objects should be interpreted, would be nice.

The records have no value because they are external requests that don't set a cookie. The fact that there is an external request is already useful information.
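To illustrate how such hostname-only records can arise, here is a minimal sketch that walks the entries of a HAR file and notes, per requested hostname, whether it is external and which cookies it set. This is not the PR's normalizer code: the function name `summarize_hosts`, the `target_host` parameter, and the sample data are all hypothetical, and only the HAR 1.2 fields `log.entries`, `request.url`, and `response.cookies` are assumed.

```python
from urllib.parse import urlsplit


def summarize_hosts(har: dict, target_host: str) -> dict:
    """Map each requested hostname to an external flag and the cookie names it set."""
    summary = {}
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname or ""
        record = summary.setdefault(
            host, {"external": host != target_host, "cookies": []}
        )
        # HAR 1.2 lists cookies set by a response under response.cookies.
        for cookie in entry["response"].get("cookies", []):
            record["cookies"].append(cookie["name"])
    return summary


# Made-up sample data for illustration, not output from the boefje.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/"},
                "response": {"cookies": [{"name": "session", "value": "abc"}]},
            },
            {
                "request": {"url": "https://cdn.example.net/lib.js"},
                "response": {"cookies": []},
            },
        ]
    }
}

hosts = summarize_hosts(sample_har, "example.com")
```

In this sketch `cdn.example.net` ends up with an empty cookie list, which is the kind of record that shows only a hostname in the report. Note that the plain string comparison also treats subdomains of the target as external; whether that is desirable is a design choice this sketch does not settle.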

dekkers avatar Oct 31 '23 12:10 dekkers

Multiple pages inside a HAR file can be dealt with in a separate issue: https://github.com/minvws/nl-kat-coordination/issues/1996
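For context on what handling multiple pages could involve, the sketch below groups HAR entries by their `pageref` field, which HAR 1.2 uses to link an entry to an item in `log.pages`. The function name `entries_by_page` and the sample data are hypothetical, and this is only one possible approach, not a proposal for how the linked issue should be solved.

```python
from collections import defaultdict


def entries_by_page(har: dict) -> dict:
    """Group request URLs by the page (pageref) they belong to."""
    grouped = defaultdict(list)
    for entry in har["log"]["entries"]:
        # pageref is optional in HAR 1.2; entries without it are grouped under None.
        grouped[entry.get("pageref")].append(entry["request"]["url"])
    return dict(grouped)


# Made-up sample data for illustration.
sample_har = {
    "log": {
        "pages": [{"id": "page_1"}, {"id": "page_2"}],
        "entries": [
            {"pageref": "page_1", "request": {"url": "https://example.com/"}},
            {"pageref": "page_1", "request": {"url": "https://example.com/app.js"}},
            {"pageref": "page_2", "request": {"url": "https://example.com/login"}},
        ],
    }
}

pages = entries_by_page(sample_har)
```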

underdarknl avatar Nov 02 '23 14:11 underdarknl

Checklist for QA:

  • [x] I have checked out this branch, and successfully ran a fresh make reset.
  • [x] I confirmed that there are no unintended functional regressions in this branch:
    • [x] I have managed to pass the onboarding flow
    • [ ] Objects and Findings are created properly
    • [x] Tasks are created and completed properly
  • [ ] I confirmed that the PR's advertised feature or hotfix works as intended.

What works:

  • Objects are created.
  • A report can be generated, which shows a nice table with all the cookies found and their settings.

What doesn't work:

  • It appears that no Findings are currently being created when cookies are insufficient, e.g. when the Secure or HttpOnly flag is missing. This is something I'd expect as a minimum feature.
  • The output is stored in JSON/dictionary format under Objects. This does not match how other boefjes display their output. (See screenshot below)
  • It does not appear to crawl past the root of a URL, at least not for the various domains I've tried.
  • The explanation in the Katalogus should be improved. Currently the name of this boefje suggests that it crawls web applications for various paths (i.e. dirbuster-like); it doesn't mention anything about crawling for cookies. (See screenshot below)
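To make the first point concrete, this is roughly the kind of check I'd expect, sketched against the cookie fields that HAR 1.2 defines (`name`, `secure`, `httpOnly`). It is only an illustration: the function name `missing_cookie_flags` and the sample data are hypothetical, and it returns plain tuples rather than the Finding objects a real normalizer or bit would create.

```python
def missing_cookie_flags(har: dict) -> list:
    """Return (cookie name, missing flag) pairs for cookies lacking Secure or HttpOnly."""
    results = []
    for entry in har["log"]["entries"]:
        for cookie in entry["response"].get("cookies", []):
            for flag in ("secure", "httpOnly"):
                # Both flags are optional in HAR 1.2; treat an absent flag as not set.
                if not cookie.get(flag, False):
                    results.append((cookie["name"], flag))
    return results


# Made-up sample data for illustration.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/"},
                "response": {
                    "cookies": [
                        {"name": "session", "value": "abc", "secure": True, "httpOnly": False},
                        {"name": "tracker", "value": "xyz"},
                    ]
                },
            }
        ]
    }
}

flagged = missing_cookie_flags(sample_har)
```

Here `session` would be reported for the missing HttpOnly flag and `tracker` for both flags.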

Bug or feature?:

Nothing found yet that hasn't been mentioned before.

Screenshots

mitproxy-objects

mitmproxy-katalogus

stephanie0x00 avatar Nov 27 '23 16:11 stephanie0x00

Let's create a HAR file boefje from this PR, and move the normalisers to a separate issue / PR once we figure out what kind of objects we want in the graph (specifically, how fine-grained we want the cookies to be).

underdarknl avatar Mar 12 '24 12:03 underdarknl