tag-security icon indicating copy to clipboard operation
tag-security copied to clipboard

Help with an extra set of eyes on some security critical (windows) code?

Open evan2645 opened this issue 2 years ago • 10 comments

Hello everybody! I'm one of the maintainers on the SPIFFE/SPIRE projects, and we are currently working on support for Windows. As part of this work, we have ~500 lines of extra sensitive security critical code. We'd really like to get some "external" review of this code, particularly from folks with deep knowledge of the Windows APIs. I asked @anvega if the Security TAG would be interested/willing to review, and he suggested I post here (thanks Andres!).

In the interest of getting things rolling, I'm providing some background information below. If this is better directed elsewhere, please do let me know and I can close this out. Thanks in advance!

Background

SPIRE performs what we call "workload attestation", which is the process of independently "proving" the identity of a workload without requiring it to present any credentials. This is accomplished through 1) the resolution of a local process PID given only basic caller information (in this case, a TCP caller), and 2) the discovery of the natural properties or attributes of this process. These properties or attributes are then used as inputs to a lookup in order to figure out what identity should be issued, thus solving the secure introduction problem at the node level.

The Challenge

PID race and reuse is a major concern. The SPIRE team has spent years growing deep knowledge and long list of mitigations against such attacks under Linux and the BSD family, including Darwin. We are only now starting this journey on Windows.

Connections can be long-lived, and practically always, our caller information is bound to the connection at the time that Accept unblocks. Workload attestation is the very first line of defense in SPIRE, and the SPIRE threat model declares the workloads calling this API to be wholly untrusted. As a result, the security properties of the workload attestation mechanisms are absolutely critical.

The PR

Please find below a link to the PR with the code in question. SPIRE implements this logic in a package called peertracker, whose job is to implement platform-specific code for solving secure PID resolution and reference. Its exposed interface is generally a listener, and when conns are accepted off of this listener, the caller PID is resolved and a watcher is started to track the liveness of the process. Consumers of this package then go on to perform process introspection, and at the end of this introspection they call a function to verify that the original process is still alive. If it is not, the introspection fails.

With the introduction of Windows support is also support for a TCP-based listener. We did not want to do this because we feel it fractures the SPIRE experience across different platforms, however we could not find the APIs to make this work over a Unix Domain Socket in Windows.

Finally, even the current APIs we get are not race free. We try to minimize the window and install mitigations where possible. Ideally, we can have a race-free API in Windows :)

Thanks again for your time on this!

https://github.com/spiffe/spire/pull/2695

/cc @amartinezfayo

evan2645 avatar Feb 02 '22 00:02 evan2645

@evan2645 @anvega there are some folks over in k8s sig-windows who may be able to help please see: https://kubernetes.slack.com/archives/C0SJ4AFB7/p1643762569114879

dims avatar Feb 02 '22 13:02 dims

We might also write this up as an academic paper or a blog and try to get relevant folks at Microsoft to take a look...

On Wed, Feb 2, 2022 at 8:02 AM Evan Gilman @.***> wrote:

Hello everybody! I'm one of the maintainers on the SPIFFE/SPIRE projects, and we are currently working on support for Windows. As part of this work, we have ~500 lines of extra sensitive security critical code. We'd really like to get some "external" review of this code, particularly from folks with deep knowledge of the Windows APIs. I asked @anvega https://github.com/anvega if the Security TAG would be interested/willing to review, and he suggested I post here (thanks Andres!).

In the interest of getting things rolling, I'm providing some background information below. If this is better directed elsewhere, please do let me know and I can close this out. Thanks in advance! Background

SPIRE performs what we call "workload attestation", which is the process of independently "proving" the identity of a workload without requiring it to present any credentials. This is accomplished through 1) the resolution of a local process PID given only basic caller information (in this case, a TCP caller), and 2) the discovery of the natural properties or attributes of this process. These properties or attributes are then used as inputs to a lookup in order to figure out what identity should be issued, thus solving the secure introduction problem at the node level. The Challenge

PID race and reuse is a major concern. The SPIRE team has spent years growing deep knowledge and long list of mitigations against such attacks under Linux and the BSD family, including Darwin. We are only now starting this journey on Windows.

Connections can be long-lived, and practically always, our caller information is bound to the connection at the time that Accept unblocks. Workload attestation is the very first line of defense in SPIRE, and the SPIRE threat model declares the workloads calling this API to be wholly untrusted. As a result, the security properties of the workload attestation mechanisms are absolutely critical. The PR

Please find below a link to the PR with the code in question. SPIRE implements this logic in a package called peertracker, whose job is to implement platform-specific code for solving secure PID resolution and reference. Its exposed interface is generally a listener, and when conns are accepted off of this listener, the caller PID is resolved and a watcher is started to track the liveness of the process. Consumers of this package then go on to perform process introspection, and at the end of this introspection they call a function to verify that the original process is still alive. If it is not, the introspection fails.

With the introduction of Windows support is also support for a TCP-based listener. We did not want to do this because we feel it fractures the SPIRE experience across different platforms, however we could not find the APIs to make this work over a Unix Domain Socket in Windows.

Finally, even the current APIs we get are not race free. We try to minimize the window and install mitigations where possible. Ideally, we can have a race-free API in Windows :)

Thanks again for your time on this!

spiffe/spire#2695 https://github.com/spiffe/spire/pull/2695

/cc @amartinezfayo https://github.com/amartinezfayo

— Reply to this email directly, view it on GitHub https://github.com/cncf/tag-security/issues/853, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGROD5B73IERHJ3EHM5T73UZBYBLANCNFSM5NKS53XA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

JustinCappos avatar Feb 06 '22 03:02 JustinCappos

We might also write this up as an academic paper or a blog and try to get relevant folks at Microsoft to take a look...

😍 What can I do to help?

@evan2645 @anvega there are some folks over in k8s sig-windows who may be able to help please see: https://kubernetes.slack.com/archives/C0SJ4AFB7/p1643762569114879

Thank you for this @dims, I dropped a message on that thread but we haven't heard much back just yet. I'll drop another message there today ... it sounds like we're almost to the end of our internal review process. Thanks again for your help everyone ❤️

evan2645 avatar Feb 10 '22 19:02 evan2645

Small and tangential update for anyone who may be following along here:

With the introduction of Windows support is also support for a TCP-based listener. We did not want to do this because we feel it fractures the SPIRE experience across different platforms, however we could not find the APIs to make this work over a Unix Domain Socket in Windows.

Unfortunately, we learned recently that Envoy requires TLS on non-UDS based SDS sessions, which means that SPIRE will likely not be able to support Envoy on Windows unless we can find a way to use UDS instead of TCP. We're looking at potential workarounds etc.

evan2645 avatar Feb 15 '22 20:02 evan2645

Hello again - we've been unable to get any non-maintainer review of this code so far. It was merged some weeks ago already, and will probably be officially released sometime in the next couple months.

We'd still really like to get a review .. but are not sure how to proceed. Happy to close this out since it seems to just be hanging around, but wondering if there is somewhere else I can/should direct this?

evan2645 avatar Mar 18 '22 02:03 evan2645

This issue has been automatically marked as inactive because it has not had recent activity.

stale[bot] avatar Jun 04 '22 02:06 stale[bot]

Hi @evan2645 - should i close this issue or is this still an ongoing ask?

lumjjb avatar Jun 04 '22 15:06 lumjjb

This issue has been automatically marked as inactive because it has not had recent activity.

stale[bot] avatar Aug 12 '22 04:08 stale[bot]

Hi @lumjjb sorry for the delay, I've been out on leave

We were never able to get a review of the code in question, but it's never too late and we'd still love someone to look at it. At the same time, I don't know that keeping this issue open any longer is going to improve the odds of that, so if we want to call it quits and close it out, I think that's ok. And apologies for the oddball issue 😅 I'm not sure how you normally manage such cases

evan2645 avatar Aug 12 '22 16:08 evan2645

yea i think we'll keep it open for now, seems worthwhile just in case someone stumbles upon it. We are starting something around zero knowledge (#950) so may have some new members come in!

lumjjb avatar Aug 12 '22 19:08 lumjjb

This issue has been automatically marked as inactive because it has not had recent activity.

stale[bot] avatar Oct 16 '22 08:10 stale[bot]

This issue has been automatically marked as inactive because it has not had recent activity.

stale[bot] avatar Dec 24 '22 04:12 stale[bot]

Closing this issue as the code was released, it didn't receive solicited attention. Hopefully, in the future, we can do better to coordinate assistance.

anvega avatar Jun 21 '23 01:06 anvega