detect-secrets icon indicating copy to clipboard operation
detect-secrets copied to clipboard

Add a plugin for EmailAddress

Open perryzjc opened this issue 1 year ago ā€¢ 11 comments

  • Please check if the PR fulfills these requirements
  • [X] Tests for the changes have been added
  • [X] Docs have been added / updated
  • [x] All CI checks are green
  • What kind of change does this PR introduce?

This PR introduces a new feature in the form of an Email Address Detector.

  • What is the current behavior?

Currently, the system does not have specific functionality that allows for the detection of email addresses.

  • What is the new behavior (if this is a feature change)? With this feature change, the system can now detect email addresses. It uses a regex-based detector to identify valid email addresses and ignores certain whitelist email addresses to reduce false positives. It primarily validates the general format of email addresses but does not adhere strictly to email format standards such as RFC 5322 for the concerns of complexity and efficiency.

  • Does this PR introduce a breaking change?

This PR does not introduce a breaking change. The new detector class EmailAddressDetector is a standalone addition and does not alter or interfere with the existing code base.

  • Other information: In addition to the implementation of the Email Address Detector, this PR also includes comprehensive tests to ensure the correct functionality of the detector. It tests the detector against valid and invalid email addresses, and email addresses that are part of a larger string, as well as whitelist and non-whitelist email addresses.

The code documentation has been updated to reflect these changes. The new feature is expected to enhance the system's capabilities in handling email address-related data.

perryzjc avatar May 16 '23 11:05 perryzjc

FYI @KevinHock - thoughts?

riverma avatar Oct 18 '23 20:10 riverma

Hi @perryzjc, thank you for opening this PR. I noticed our PR checks did not run, so I'll have to figure what happened there first. Rest assured I'll get to this as soon as possible.

lorenzodb1 avatar Oct 18 '23 20:10 lorenzodb1

@perryzjc hi again šŸ˜„ could you please merge master into your branch?

lorenzodb1 avatar Nov 16 '23 00:11 lorenzodb1

@perryzjc hi again šŸ˜„ could you please merge master into your branch?

@lorenzodb1 hi, I've merged the latest updates from the master branch into my feature branch as requested. Please let me know if there's anything else needed for this pull request. Thanks!

perryzjc avatar Nov 16 '23 23:11 perryzjc

@perryzjc looks like there's some pre-commit checks failing, please take a look

lorenzodb1 avatar Nov 17 '23 00:11 lorenzodb1

@perryzjc looks like there's some pre-commit checks failing, please take a look

@lorenzodb1 Thanks for letting me know. I've resolved the issues and all checks are now passing in the workflow.

perryzjc avatar Nov 17 '23 05:11 perryzjc

Do you think we could add a way for someone to specify a domain so that this plugin flags only emails from that domain?

@lorenzodb1 What if we add an additional argument for the command line to specify the domains and make it the config in the baseline file?

  • Add a --detect-only-domains argument for specifying the domains we want to monitor.
  • Update our .secrets.baseline to include these domain settings.

So, instead of the usual: $ detect-secrets scan test_data/ --all-files > .secrets.baseline

We'd have something like: $ detect-secrets scan test_data/ --all-files --detect-only-domains "gmail.com,mail.example.com" > .secrets.baseline

The baseline file would then look like this:

  "plugins_used": [
    ...
    {
      "name": "EmailAddressDetector",
      "detect-only-domains": "gmail.com,mail.example.com"
    },
    ...
  ]

This setup means our pre-commit checks could catch emails based on this configuration, giving us more control.

Iā€™m curious about your take on this. Are there any concerns regarding compatibility or implementation challenges we should be aware of?

perryzjc avatar Nov 19 '23 05:11 perryzjc

@perryzjc that's exactly what I was thinking. I don't think there would be compatibility issues and I can't think of any implementation challenges you might face. I'll tag @jpdakran as he can maybe come up with something

lorenzodb1 avatar Nov 20 '23 19:11 lorenzodb1

Hey @lorenzodb1 @jpdakran - just wanted to poke here as well. Hope @perryzjc contribution can be considered for merge.

riverma avatar Mar 01 '24 22:03 riverma

@perryzjc I just got around to review this and I wondering: do you still intend to add the --detect-only-domains flag? I think that'd be a useful one.

lorenzodb1 avatar Apr 30 '24 16:04 lorenzodb1

@perryzjc looks like merging https://github.com/Yelp/detect-secrets/pull/692 created some conflicts in here. I'd ask you to solve these (which will also have the additional benefit of updating the PR checks, meaning it'd run checks for py3.11 too)

lorenzodb1 avatar Apr 30 '24 16:04 lorenzodb1