vouch-proxy icon indicating copy to clipboard operation
vouch-proxy copied to clipboard

Added regexWhiteList feature.

Open danielewood opened this issue 4 years ago • 4 comments

I wanted this to be able to have fine grained control over the whitelist (and to be able to whitelist users as well as domains/subdomains).

If wanted, I'll do another PR later to add a regexBlackList that overrides everything, even domains.

Changes/Additions:

  • handlers.go: Add regexWhiteList conditional to VerifyUser.
    • Triggers after whitelist, so it will always be overridden by whitelist.
  • cfg.go: Add start-up check to make sure all the regex expressions at least compile
  • handlers_test.go: Add tests using handler_regexwhitelist.yml
  • handler_regexwhitelist.yml: email is a test1?\@(domain|example)\.com, which will always be a valid match for [email protected]
  • config.yml_example:
  # regexWhiteList - (optional) allows only the listed usernames
  # Note: whiteList always takes precidence and disables regexWhiteList
  #
  # Single-quotes(') are to prevent the yaml parser from misinterpreting the line)
  # usernames are usually email addresses (google, most oidc providers) or login/username for github and github enterprise
  
  regexWhiteList:
 - '^bob[4-9][email protected]$'
 - '^alice@.+\.com$'
 - '^alice@your(other)?domain\.com$'
 - '^joe@yourdomain\.com$'
 - '^j[aneo]{1,3}@yourdomain\.(org|net|com)$'

danielewood avatar Apr 23 '20 03:04 danielewood

@danielewood thanks much for the contribution to VP and the effort. Always appreciated.

I keep on looking at VerifyUser and wonder where that really should be headed. There's confusion (even within the VP posse) over the role of domains/allowAllUsers/claims. I'm inclined to clarify that before adding new verification paths.

In general, before adding config elements and features to VP, particularly authz stuff we try to see if it can be handled by either the IdP or Nginx. Have you considered openresty for this?

Can you please clarify the itch you're scratching? What IdP do you use? Is this about domains?

You're saying this is a regex whitelist but I'm not seeing anything to prevent a negative assertion. Have you tested a negative regex?

Thanks again.

bnfinet avatar Apr 24 '20 21:04 bnfinet

In general, before adding config elements and features to VP, particularly authz stuff we try to see if it can be handled by either the IdP or Nginx. Have you considered openresty for this?

I have, but since my goal is not to run logic on different claims values by group or other attributes, it seems that trying it with OpenResty is a sledgehammer for a finish nail. After that I have the issue of it complicating the deployment of vouch as there is no prebuilt OpenResty container.

Can you please clarify the itch you're scratching? What IdP do you use? Is this about domains?

The specific itch I am scratching is to have a more flexible whitelist that includes different domains.

With vouch-proxy, as it currently works with Google as the IdP, if you make the Oauth External, you can now validate any Google Identity associated with an email address (Gmail or GSuite). The limitation, is that once whitelist is enabled, you need to know every individual email address in advance, when I would prefer to be able to whitelist a domain or maybe only a few individuals within a domain. Regex gives me the power and control to accomplish the flexibility I desire. To be very clear, here is what one of my configs looks like:

vouch:
  logLevel: debug

  domains:
  - mylabdomain.com

  whiteList:
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]
  - [email protected]

oauth:
  provider: google
  # get credentials from...
  # https://console.developers.google.com/apis/credentials
  client_id: xxxxxxxxxxxxxx.apps.googleusercontent.com
  client_secret: xxxxxxxxxxxxxx
  callback_urls:
    - https://vouch.mylabdomain.com/auth

What all these emails have in common is that they all use GMail/GSuite for email. The domain mylabdomain.com doesnt have an IdP at all, and it doesnt need it. It just needed proof of ownership to be added to the Google OAuth Callback URL. I need no cooperation whatsoever from the owner of school.edu because they are already using GSuite, so they can use the Google IdP and pass through their validated email (thus, I could whitelist an entire external org, if I wanted).

With regexWhitelist this can become:

  whiteList:
  - '^dwood@gmail\.com$'
  - '^dwood@work-domain\.com$'
  - '^dwood@school-domain\.edu$'
  - '^bob.smith@gmail\.com$'
  - '^.+@family-domain\.com$'
  - '^.+@friends-domain-1\.com$'
  - '^.+@friends-domain-2\.com$'
  - '^.+@friends-domain-3\.com$'

You're saying this is a regex whitelist but I'm not seeing anything to prevent a negative assertion. Have you tested a negative regex?

There is nothing to prevent a negative assertion, but that only makes the test fail and it move on to the next test item, or if they are all exhausted, move on to the TeamWhiteList function. I intentionally did not add a function to cause the entire process to fail on a negative match, that is what I was considering with a regexBlacklist.


To address the inline code comments, you are correct that it would be best to compile the regex at startup and pass that compiled regex.

The most efficient method would be to aggregate every regex into a large statement by concatenating the strings with pipes (|) to denote a long regex with OR statements. The downside of this approach is that you cannot tell which regex statement matched (unless I am missing something about golang and there is some way to output the matching regex syntax).

The slightly less efficient method would be to compile the regex to an array in cfg.go

var CompiledRegex []*regexp.Regexp
if len(Cfg.RegexWhiteList) != 0 {
	for i, wl := range Cfg.RegexWhiteList {
		reWhiteList, reWhiteListErr := regexp.Compile(wl)
		if reWhiteListErr != nil {
			return fmt.Errorf("Uncompilable regex parameter: '%v'", wl)
		}
		CompiledRegex = append(CompiledRegex, reWhiteList)
		log.Debugf("compiled regex is %v = %v", CompiledRegex[i], reWhiteList)
		if  CompiledRegex[i].MatchString(wl) {
			log.Debugf("%v is not a regex statement and will match any occucrence of %v", wl)
		}
	}
	log.Debugf("compiled regex array %v", CompiledRegex)
}

Then we iterate over the array of compiled regex statements in handlers.go.

I'm not sure of the best way to get this array over to handlers.go, but it seems to me that a global variable would be appropriate, maybe you have a better route.

P.S. I wont take any critique of my code quality personal in any way. I know it's probably not great as I am very unfamiliar with golang overall and had to do quite a bit of googling to even accomplish the patch that I did.

danielewood avatar Apr 28 '20 23:04 danielewood

Refactored to pass a compiled array (cfg.CompiledRegex) to handlers.go.

danielewood avatar Apr 29 '20 17:04 danielewood

Refactored for changes in auth.go and handlers_test.go

danielewood avatar May 05 '20 22:05 danielewood