[PoC] Provide storage backend for HashiCorp Vault

Open tiran opened this issue 8 years ago • 10 comments
We would like to show how to integrate Custodia with existing secrets storage solutions such as HashiCorp Vault or the others mentioned at https://medium.com/on-docker/secrets-and-lie-abilities-the-state-of-modern-secret-management-2017-c82ec9136a3d . Vault from HashiCorp is one of the well-known and established backends.

You can use https://github.com/latchset/custodia.ipa as a template for a HC Vault plugin.

tiran avatar Mar 15 '17 16:03 tiran

I'm interested in this, and think I'll wind up needing it. Might wind up writing it. I'll keep in touch. Presumably this would use the HVAC Python API.

Alan-R avatar Jul 06 '17 15:07 Alan-R

@Alan-R we don't have any resources to work on a HashiCorp Vault plugin in the near future. But that doesn't have to stop you. Custodia is extensible and you can easily write your own external plugin. I opened #223 to document how to write a plugin.

tiran avatar Aug 07 '17 13:08 tiran

I already figured out I could write one. Undecided on whether to do that, or take an alternative that I mentioned in email. Documentation of the interface I need to match, and how to configure it would make it easier. Thanks for looking into that with #223.

Alan-R avatar Aug 07 '17 13:08 Alan-R

Regarding race conditions...

If the client has exited, then your socket won't work anyway, and you can't proxy any secrets. But there's a simple way to detect if the client has exited... Perform the socket call again after reading things from /proc, and so on, and if it fails or gives different results - you've hit the race condition. Fail the authentication. But if the authentication succeeds, and the socket still works, and you've double-checked the socket call results, then you're connected to the original, a child, or someone your client has given the socket to - so they are trusted by the client. They can always give away the secrets if they want to. So, giving away an authenticated socket is not something for you to worry about. In the race condition case, you may have gotten some incorrect or polluted data, but you're not going to use it - so it's OK. However, most likely you'll get a failure of reading it, etc.

Alan-R avatar Aug 09 '17 16:08 Alan-R

No, this is not how the race condition works.

A malicious race condition works this way:

  1. Open the socket but do not send any data. Important not the results of getpeercred() are set in stone at this point.
  2. fork()
  3. exit() the original process
  4. wait for the original pid to be reused by another container
  5. ask for that container's secrets

This is not hard or far fetched. And if you can create many children you can also relatively easily restrict the pool of PIDs the kernel can choose from.

Of course, if you are in a proper containerized environment, your process will have no visibility of when a process ends up reusing the PID, and there is also the PID/UID/GID consistency issue to overcome. However, it is not a race that can be ignored, and it needs to be properly handled.

Simo.

-- Simo Sorce Sr. Principal Software Engineer Red Hat, Inc

simo5 avatar Aug 09 '17 17:08 simo5

Thanks for the overview.

I'm having trouble with this phrase: "Important not the results of getpeercred() are set in stone". I think you want to do the getpeercred() operation right away and determine their identity. You can know their identity without them saying anything. And, in fact, you shouldn't rely on anything they say anyway ;-). So, if you do the getpeercred() immediately, and then again when they ask for secrets, and anything has changed, give them EFINGER. If one of your children wants access to secrets, let them ask for themselves. In fact, if you don't know who they are immediately, close the socket and get out.

It's easy to determine whether the process is either the "lead dog" process in a container or one of its immediate children, so you can restrict asking for secrets to those. This is a very common case. You have three sources of data to use in authenticating the process:

  • the results of the SO_PEERCRED call
  • the contents of /proc for that process
  • the contents of docker inspect output for the container indicated by /proc

It's annoying that one can't get /proc//exe without the CAP_SYS_PTRACE capability.

Alan-R avatar Aug 10 '17 00:08 Alan-R

On Thu, 2017-08-10 at 00:05 +0000, Alan Robertson wrote:

> Thanks for the overview.
>
> I'm having trouble with this phrase. "Important not the results of getpeercred() are set in stone".

not -> note, sorry.

> I think you want to do the getpeercred() operation right away, and determine their identity. You can know their identity without them saying anything. And, in fact, you shouldn't rely on anything they say anyway ;-). So, if you do the getpeercred() immediately, and then again when they ask for secrets - if anything changes - give them EFINGER.

Nothing can change, the "creds" of the peer are sourced at open() socket time for the client and stored in a kernel structure with the file descriptor. Even if the socket is passed around or fork/exec happens on the client side you will always get the original PID,UID,GID of the process that open()ed the socket.

This is the text describing the function in socket(7):

       SO_PEERCRED
              Return the credentials of the foreign process connected to this socket. This is possible only for connected AF_UNIX stream sockets and AF_UNIX stream and datagram socket pairs created using socketpair(2); see unix(7). The returned credentials are those that were in effect at the time of the call to connect(2) or socketpair(2). The argument is a ucred structure; define the _GNU_SOURCE feature test macro to obtain the definition of that structure from <sys/socket.h>. This socket option is read-only.

Note the sentence in the middle, it is very important.
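To make that sentence concrete, here is a small Linux-only sketch (not Custodia code; the function names and the temporary socket path are made up for illustration). A process connects to a Unix socket, forks, and exits; its child then sends the request over the inherited socket. The server still sees the PID of the process that called connect(), even though that process is gone.

```python
import os
import socket
import struct
import tempfile


def peer_cred(conn):
    """Return (pid, uid, gid) recorded by the kernel for the peer of conn."""
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)


def fork_and_exit_demo():
    """Connect, fork, let the connecting process exit, then read SO_PEERCRED."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    go_r, go_w = os.pipe()

    connector = os.fork()
    if connector == 0:
        try:
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            client.connect(path)      # credentials are captured here
            if os.fork() == 0:
                os.read(go_r, 1)      # wait until the connector is gone
                client.sendall(b"request")
        finally:
            os._exit(0)               # connector and its child both end here

    conn, _ = server.accept()
    os.waitpid(connector, 0)          # the connecting process has exited
    os.write(go_w, b"1")              # let the surviving child speak
    data = conn.recv(16)
    reported_pid = peer_cred(conn)[0]

    conn.close()
    server.close()
    os.unlink(path)
    os.rmdir(tmpdir)
    return connector, reported_pid, data


if __name__ == "__main__":
    original_pid, reported_pid, data = fork_and_exit_demo()
    print("connector pid:", original_pid, "SO_PEERCRED pid:", reported_pid)
```

The reported PID equals the PID of the exited connector, which is why calling getpeercred() a second time cannot detect the problem: the answer never changes.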

Restricting to the first process or immediate children would not be useful; an attacker would just have to wait until a "first" process happens to reuse the PID.

So this is why getpeercred() while super-convenient has a potential race.

Note, the race is very small, especially if the server crawls /proc as soon as a connection is made (rather than when data is received), so in practice I do not feel too bad using this, aside from the fact that crawling /proc to find cgroups and using that as a convention to query the container runtime is all a bit unclean.
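A rough sketch of the "crawl /proc as soon as a connection is made" step might look like the following. It assumes the container runtime embeds a 64-hex-digit container ID in the cgroup path (as Docker commonly does); that convention and the helper names are assumptions for illustration, not something Custodia defines.

```python
import re

# Assumption: the runtime puts a 64-hex-digit container ID in the cgroup path.
_CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(text):
    """Extract a container ID from the contents of /proc/<pid>/cgroup.

    Each line has the form "hierarchy-ID:controller-list:cgroup-path".
    Returns None when no path contains something that looks like an ID.
    """
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = _CONTAINER_ID.search(parts[2])
        if match:
            return match.group(0)
    return None


def container_id_for_pid(pid):
    """Read /proc/<pid>/cgroup right away; call this at accept() time."""
    with open("/proc/{}/cgroup".format(pid)) as handle:
        return container_id_from_cgroup(handle.read())
```

Calling `container_id_for_pid()` directly after accept(), with the PID from SO_PEERCRED, keeps the window between connect() and the /proc lookup as short as the server can make it. It does not close the window.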

Simo.

-- Simo Sorce Sr. Principal Software Engineer Red Hat, Inc

simo5 avatar Aug 10 '17 16:08 simo5

Eh of course open() -> connect() above.

simo5 avatar Aug 10 '17 16:08 simo5

Thanks much for showing me how that exploit works. Anything you can do to shorten that interval is good, of course. But it sounds like that's all you can really do, except verify that the userid and argument list of the requestor match what Docker claims they are. That information is "up to date" at that time. That increases the amount of information they have to know in order to fool you. They have to do it when they initially start, and when the request is made. It's likely they will fail a lot on the way to succeeding. If anyone looks at the audit/security logs, that will stand out. Of course, it's not certain that they'll fail a lot, and it's not certain that anyone will look at the logs. Sadly, in many places, it's highly unlikely that anyone will notice.
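As an illustration of that extra check, the sketch below compares what /proc reports for the requestor with what the container runtime claims. The helper names are hypothetical, and the expected argument list and UID are assumed to have been extracted from `docker inspect` output beforehand (for example from its `Path` and `Args` fields); none of this is Custodia's actual implementation.

```python
def parse_cmdline(raw):
    """Split the NUL-separated contents of /proc/<pid>/cmdline into a list."""
    if not raw:
        return []
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\0")]


def parse_real_uid(status_text):
    """Return the real UID from the "Uid:" line of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            return int(line.split()[1])   # fields: real, effective, saved, fs
    return None


def matches_runtime_claim(expected_argv, expected_uid, cmdline_raw, status_text):
    """True only if both the argument list and the real UID match."""
    return (
        parse_cmdline(cmdline_raw) == list(expected_argv)
        and parse_real_uid(status_text) == expected_uid
    )


def check_pid(pid, expected_argv, expected_uid):
    """Read /proc for pid and compare it with the runtime's claim."""
    with open("/proc/{}/cmdline".format(pid), "rb") as handle:
        cmdline_raw = handle.read()
    with open("/proc/{}/status".format(pid)) as handle:
        status_text = handle.read()
    return matches_runtime_claim(expected_argv, expected_uid,
                                 cmdline_raw, status_text)
```

A mismatch would be a reason to refuse the request and write an audit log entry, which is what makes repeated failed attempts visible to anyone who reads the logs.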

Alan-R avatar Aug 10 '17 16:08 Alan-R

Nodding on all your remarks

simo5 avatar Aug 10 '17 17:08 simo5