SEDATED icon indicating copy to clipboard operation
SEDATED copied to clipboard

5sec Timeout - Performance Issue

Open sagarvsh opened this issue 4 years ago • 8 comments

Hi Team,

We have implemented SEDATED in our Organization and it's been very successful. Thank you for this. One challenge we are facing is on the performance, many times, even with medium sized changes in the push or when we push the changes to Master branch, we easily end up with 5sec rule timeout. For now, we are whitelisting the repo till the push is complete and revert it.

Any recommendations or better work around to deal with this issue, as the current process is slowing down or blocking the development process even when there is no sensitive data in the code.

Thank you Sagar

sagarvsh avatar Apr 21 '20 18:04 sagarvsh

@sagarvsh Thanks for reaching out, yes, we have actually ran into similar issues from time to time as well with this when devs aren't following certain best practices or when they aren't committing coding regularly.

One thing to mention right away is squash merging. If you squash merge then this causes SEDATED to rescan all the changes in the commits again. If possible, avoid squash merging, or if you require squashing then maybe do it more often. By doing this, it reduces the amount of content that is scanned in by SEDATED in that 5 second window.

Here are some other suggestions to watch out for especially for repos you repeatedly have issues with:

  1. Make sure devs are not committing compiled code (/bin for example) to git as this is unnecessary, doesn't follow best practices, and just slows things down overall.

  2. Avoid committing: minified files, jars, zips, binary files (images/pdfs), node modules folder, etc... as all these files/folders are considered part of the "built artifact" and should not really be stored in git in the first place.

Another option to help avoid the 5 second issue, one of the changes we intentionally made was how we are handling whitelisted commits. Previously, if a commit was whitelisted we still ran the grep command (expensive) on the contents of that commit. Now though, in the lasted version of the code we no longer grep if the commit is whitelisted. Here is what that means. If you push 5 huge commits and they are all whitelisted, you will not hit the 5 second window because we don't bother interrogating the huge content in your 5 commits. With that said, an alternative approach to repo whitelisting, is to just do normal commit level whitelisting on the pushes that contain a bunch of code.

SimeonCloutier avatar Apr 21 '20 19:04 SimeonCloutier

@sagarvsh glad to hear you are using SEDATED! 😊 We are working on updating the regexes to help improve the performance and accuracy as well and make it less often that devs will hit that 5s timeout.

Just curious, what types of files are being pushed and/or lines of code in the commit(s) that is causing the push to hit the 5s timeout? images/pdfs/zips/bundle.js? We may be able to optimize if we can find a common culprit.

d3jk avatar Apr 21 '20 19:04 d3jk

Thank you @SimeonCloutier This really helps. I will test this out and update back.

sagarvsh avatar Apr 21 '20 20:04 sagarvsh

@denniskennedy We are working very hard to clear out the sensitive data from our code base and this tool is a boon to prevent new sensitive data from getting in.

Here is one example where we have 5s rule issue. 323 commits, 141 files changed (130 of them are .js) and 17k lines modified (10k lines added).

sagarvsh avatar Apr 21 '20 20:04 sagarvsh

@sagarvsh In this example you mentioned, is a developer building up 17k modified lines of code (I would assume over the course of many months) and then many months later pushing this up to github? If not, can you explain the the flow of pushing/pulling/merging the code that is occurring in this example?

SimeonCloutier avatar Apr 22 '20 11:04 SimeonCloutier

@SimeonCloutier I think I was found the issue.

  1. This is a really big repo with many developers. Developers fork the repo and develop and then push the changes to parent repo.
  2. I used to remove the whitelisted commit id's after the push is completed in their local. This may have added load on the process when merging from forked repo to parent repo.

I will start collecting more metrics which might help.

sagarvsh avatar Apr 22 '20 22:04 sagarvsh

@sagarvsh FYI, we just released a new version of SEDATED®, with lots of improvements (see below).

https://github.com/OWASP/SEDATED/releases/tag/v1.2.0

SimeonCloutier avatar Jun 17 '20 17:06 SimeonCloutier

Perfect, let me try it today and let you know.

sagarvsh avatar Jun 17 '20 21:06 sagarvsh