
OAuth public/private key concerns

Open vytautas-karpavicius opened this issue 3 years ago • 7 comments

A recent PR #4364 made it possible to specify a private key per cluster.

This got my attention because it implies that different clusters may have different key pairs. https://github.com/uber/cadence/blob/5191468f2297dc7666c574fe74492fae4e85b58a/common/config/cluster.go#L70

Another thing I noticed is that for verification we have only one public key per cluster. https://github.com/uber/cadence/blob/7aca8298408de8ef3b2b4035f31f63b27f3821ed/common/authorization/oauthAuthorizer.go#L65

Here are my concerns:

  • If we assume that all Cadence server + worker deployments fall under a single owner, then having multiple key pairs (one per cluster) does not make sense; a single pair would do.
  • Otherwise, I think we got it backwards.

The current setup implies that a cluster has a key pair with the public key on its side to validate JWT tokens. Since it has only one public key to validate tokens against, the private key counterpart has to be distributed to the signing parties. If everything falls under a single owner this may not be a problem.

But consider several parties. In that case you do not want to share your private key with them, as it is private by definition. Now consider the opposite scenario: each party generates its own key pair, keeps its private key for signing, and gives the Cadence server maintainer the public counterpart. The maintainer could then add the public key, but only one key can be configured, and there are several parties.
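To make that "authorized keys" style concrete, here is a minimal sketch of validating a token against a list of trusted public keys. This is not the actual Cadence authorizer; the github.com/golang-jwt/jwt/v4 library and the function name are used purely for illustration:

```go
package authsketch

import (
	"crypto/rsa"
	"errors"
	"fmt"

	"github.com/golang-jwt/jwt/v4" // illustrative JWT library, not what Cadence uses
)

// verifyAgainstAny accepts a token if it was signed by the private counterpart
// of ANY configured public key, similar to ~/.ssh/authorized_keys.
func verifyAgainstAny(tokenString string, publicKeys []*rsa.PublicKey) (*jwt.Token, error) {
	var lastErr error
	for _, key := range publicKeys {
		token, err := jwt.Parse(tokenString, func(t *jwt.Token) (interface{}, error) {
			// Only accept RSA signatures for these keys.
			if _, ok := t.Method.(*jwt.SigningMethodRSA); !ok {
				return nil, fmt.Errorf("unexpected signing method: %v", t.Header["alg"])
			}
			return key, nil
		})
		if err == nil && token.Valid {
			return token, nil
		}
		lastErr = err
	}
	if lastErr == nil {
		lastErr = errors.New("no public keys configured")
	}
	return nil, fmt.Errorf("token not signed by any trusted key: %w", lastErr)
}
```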

Let's review a similar pattern: SSH. The host has a file ~/.ssh/authorized_keys where multiple public keys that are allowed to log in can be added. Another example is GitHub itself: you can add multiple public keys that are allowed to log in.

I suggest the following:

  • we should accept and validate against a list of public keys (not a single one).
  • a cluster should contain only one private key for itself (not a list with one per replication cluster).

@noiarek @longquanzheng

vytautas-karpavicius avatar Sep 29 '21 10:09 vytautas-karpavicius

Hi @vytautas-karpavicius, thanks for the callout.

we should accept and validate against a list of public keys (not a single one).

Agreed. I have it in the todo list in the design doc, but it's not the highest priority at the moment, so we haven't implemented it. The assumption is that the private key will be managed by a centralized service which returns the JWT only, so we shouldn't distribute private keys. For example, Uber may have its own way of doing service auth: you would build a centralized service that owns the private key and returns JWTs to other services.

Btw, the implementation in the worker using a private key is just an example. It's not meant for actual production use. Since each company will have its own service auth mechanism, they should implement the interface themselves (mostly they won't use private/public keys; more often they would use IAM or even API keys).
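For illustration only, a provider built around a pre-issued API key might look roughly like this (the interface shape below is assumed for the sketch and may differ from the actual client interface):

```go
// Illustrative only: a worker-side token provider built around a static API key
// rather than private/public keys. The TokenProvider interface shape is assumed
// for this sketch and may differ from the actual cadence-client interface.
package workerauth

import "os"

// TokenProvider is the assumed shape of a worker-side auth provider.
type TokenProvider interface {
	GetAuthToken() ([]byte, error)
}

type apiKeyProvider struct {
	key []byte
}

// NewAPIKeyProvider reads a pre-issued API key from the environment.
func NewAPIKeyProvider() TokenProvider {
	return &apiKeyProvider{key: []byte(os.Getenv("CADENCE_API_KEY"))}
}

// GetAuthToken returns the API key; the server-side authorizer would validate
// it against its own store instead of verifying a JWT signature.
func (p *apiKeyProvider) GetAuthToken() ([]byte, error) {
	return p.key, nil
}
```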

However, it still makes sense to allow a list of public keys for rotation purposes, so I have it as a todo for this design.

longquanzheng avatar Sep 30 '21 02:09 longquanzheng

a cluster should contain only one private key for itself (not a list with one per replication cluster).

Yes and no. It's going to be difficult to guarantee that all clusters have the same key pairs, and it's not necessary to do so. Since each cluster has its own config, the keys can certainly differ. Just from a structural point of view, since we allow each cluster to have its own config, the public key should be allowed to differ as well.

But that doesn't mean they have to be different. In your case (or some cases), if you want them all to be the same, that's okay and allowed.

Just from the perspective of what we can and cannot provide, I think allowing different key pairs makes more sense than requiring every cluster to use the same one.

For example, a company may have different teams that started their own Cadence clusters and later want to connect them together in order to use the XDC/cross-cluster feature. This would open up that possibility. Even the same team owning different clusters may want to use different keys for security purposes.

longquanzheng avatar Sep 30 '21 02:09 longquanzheng

Thank you @longquanzheng for clarifications and detailed answers.

OK, I see that multiple key pairs are indeed required and may be useful in some cases. I'm not against it, and it does make sense.

The issue I see is not that clusters will have different key pairs, but rather the way they are currently configured. Right now we have the clusterGroup config section where private keys are specified. It has a private key for the current cluster (all good) as well as for remote clusters - this is the part I'm not comfortable with. It means that the private key of a remote cluster needs to be distributed here.

If we had a list of public keys to validate against, listing private keys for remote clusters would not be needed. The current cluster would sign requests with its own private key only, no matter the destination. The remote cluster would have a list of public keys for the callers that are allowed.
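To make the shape concrete, here is a rough sketch of how such a configuration could be modelled (type and field names are hypothetical, not the current Cadence schema):

```go
// Hypothetical configuration shape (names are illustrative, not the current
// Cadence schema): the current cluster keeps exactly one private key for
// signing outbound cross-cluster requests, and validation happens against a
// list of caller public keys, similar to ~/.ssh/authorized_keys.
package config

// OAuthSigning is what the current cluster uses to sign its own requests.
type OAuthSigning struct {
	PrivateKeyPath string `yaml:"privateKey"` // this cluster's key only
	Algorithm      string `yaml:"algorithm"`
}

// OAuthVerification lists every caller that is allowed to talk to this cluster.
type OAuthVerification struct {
	PublicKeyPaths []string `yaml:"publicKeys"` // one entry per trusted caller
	MaxTokenTTL    int64    `yaml:"maxJwtTTL"`
}
```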

If some centralized service or another implementation that manages the keys is used, that may not be an issue - true. But for this particular implementation it can be.

vytautas-karpavicius avatar Sep 30 '21 09:09 vytautas-karpavicius

It has a private key for the current cluster (all good) as well as for remote clusters - this is the part I'm not comfortable with. It means that the private key of a remote cluster needs to be distributed here.

Right. I do agree with the concern, and it looks weird. There is a little more background here: TChannel doesn't support TLS, so we have to apply authorization for cross-cluster requests. However, if we have done the authentication using gRPC, this part can be removed/skipped. They are all Cadence clusters, so we should trust them once requests are authenticated. (This is also what Temporal has done.)
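For illustration, one common way to authenticate cross-cluster gRPC traffic is mutual TLS on the gRPC listener. A minimal sketch (paths, port and wiring are illustrative, not Cadence's actual setup):

```go
// Minimal sketch (not Cadence's actual wiring) of authenticating cross-cluster
// gRPC traffic with mutual TLS: each cluster presents a certificate, and the
// server only accepts peers signed by a trusted CA. File names are illustrative.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"net"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	cert, err := tls.LoadX509KeyPair("cluster0.crt", "cluster0.key")
	if err != nil {
		log.Fatalf("load server key pair: %v", err)
	}

	caPEM, err := os.ReadFile("trusted-clusters-ca.pem")
	if err != nil {
		log.Fatalf("read CA bundle: %v", err)
	}
	caPool := x509.NewCertPool()
	if !caPool.AppendCertsFromPEM(caPEM) {
		log.Fatal("failed to parse CA bundle")
	}

	creds := credentials.NewTLS(&tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientAuth:   tls.RequireAndVerifyClientCert, // reject unauthenticated peers
		ClientCAs:    caPool,
	})

	lis, err := net.Listen("tcp", ":7833") // port is illustrative
	if err != nil {
		log.Fatalf("listen: %v", err)
	}
	server := grpc.NewServer(grpc.Creds(creds))
	// Register services here; once the peer is authenticated via mTLS,
	// per-cluster authorization checks for replication traffic could be skipped.
	log.Fatal(server.Serve(lis))
}
```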

longquanzheng avatar Oct 01 '21 19:10 longquanzheng

https://github.com/uber/cadence/pull/4492 - looks like internal traffic is using gRPC today. Does that include cross-cluster traffic? If so, we can add authentication to it and then remove the per-cluster authorization provider config: for cross-cluster traffic, as long as it passes authentication, there is no need to perform authorization checks.

longquanzheng avatar Oct 09 '21 03:10 longquanzheng

Yes, internal traffic is on gRPC by default already. For cross-DC traffic, it needs to be enabled like this: https://github.com/uber/cadence/blob/master/config/development_xdc_cluster0.yaml#L82

Yeah, that makes sense. We will need to expose config for that.

vytautas-karpavicius avatar Oct 11 '21 10:10 vytautas-karpavicius

@vytautas-karpavicius Just realized that the sys worker is broken after that PR, because sysWorker uses publicClient.hostPort, which defaults to the same value as the current cluster's RPC address. Here is a fix: https://github.com/uber/cadence/pull/4560

longquanzheng avatar Oct 11 '21 16:10 longquanzheng