findcdn icon indicating copy to clipboard operation
findcdn copied to clipboard

DRAFT: Findcdn CDN Engine Massive Rework (v1.0.0)

Open Pascal-0x90 opened this issue 3 years ago โ€ข 1 comments

๐Ÿ—ฃ Description

findcdn's old CDNEngine was clunky and took forever to run. An immense rework has been undertaken to merge in a lot of the ides from:

  • https://github.com/cisagov/findcdn/pull/53
    • https://github.com/cisagov/findcdn/issues/42
  • https://github.com/cisagov/findcdn/pull/47
  • https://github.com/cisagov/findcdn/pull/34
  • https://github.com/cisagov/findcdn/pull/16
  • https://github.com/cisagov/findcdn/issues/43
  • https://github.com/cisagov/findcdn/issues/51
  • https://github.com/cisagov/findcdn/issues/50
  • https://github.com/cisagov/findcdn/issues/21
  • https://github.com/cisagov/findcdn/issues/20

To rework the entire code-base and allow the following key points:

  • More extendable and modular analyzers.
  • Using requests library instead of urllib.
  • No confusing Chef -> DomainPot -> .... relationships.
  • Argument based parameters to disable different analyses to give you the choice on which analysis to run and not run.
  • Improvement in scanning performance by not requiring the need for all data but just when the first sight of a CDN is found.
    • Also, stop analysis on domain if IP could not be resolved (dead domain avoidance)
  • Handle improper domains more gracefully by adding it to standard output for user to grab.
  • Give total runtime in output to help with timing performance.

๐Ÿ’ญ Motivation and context

Main goal: make findcdn better, faster, stronger. ๐Ÿ˜Ž

Findcdn has been untouched for a while now without any significant improvements. Thankfully to people such as @Yablargo, @kz0ltan, @MC189, and @mcdonnnj for making issues and pull requests to help improve the quality and longevity of this repo.

Main Issues

  • Hard to extend with new analyzers.
  • Code was hard to develop on due to complexity in class relations.
  • Analyzing even one domain was super slow.

๐Ÿงช Testing

View initial testing in https://github.com/cisagov/findcdn/issues/43#issuecomment-1327983721

Test creation is still in progress.

  • [ ] Test creation completed (in progress)

โœ… Pre-approval checklist

  • [ ] This PR has an informative and human-readable title.
  • [ ] Changes are limited to a single goal - eschew scope creep!
  • [ ] All future TODOs are captured in issues, which are referenced in code comments.
  • [ ] All relevant type-of-change labels have been added.
  • [ ] I have read the CONTRIBUTING document.
  • [ ] These code changes follow cisagov code standards.
  • [ ] All relevant repo and/or project documentation has been updated to reflect the changes in this PR.
  • [ ] Tests have been added and/or modified to cover the changes in this PR.
  • [ ] All new and existing tests pass.

โœ… Pre-merge checklist

  • [ ] Revert dependencies to default branches.
  • [ ] Finalize version.

โœ… Post-merge checklist

  • [ ] Add a tag or create a release.

Pascal-0x90 avatar Nov 27 '22 00:11 Pascal-0x90

This pull request introduces 2 alerts when merging 7cb5c879a6e80635c4015113eec73994f869cf30 into 7bbf50e4a1cf1f32999c1abdcabc064d8ebf53f8 - view on LGTM.com

new alerts:

  • 2 for Explicit export is not defined

Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down โป completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine :gear: that powers LGTM.com. For more information, please check out our post on the GitHub blog.

lgtm-com[bot] avatar Nov 27 '22 00:11 lgtm-com[bot]