
[Research] Explore SPIFFE/SPIRE Integration

Open keithmattix opened this issue 2 years ago • 7 comments

Please describe the Improvement and/or Feature Request

SPIRE is a CNCF incubating project for platform-agnostic workload identity. At a very basic level, SPIRE could be used as a CA integration for OSM. However, SPIRE also provides the opportunity for OSM to leverage the SPIFFE protocol to unlock heterogeneous workloads (e.g. VMs, serverless, etc.).

Scope (please mark with X where applicable)

  • New Functionality [X]
  • Certificate Management [X]

Possible Deliverables

  • Pros and cons comparing SPIFFE to the current service identity mechanism in OSM. In particular, explore the scenario where they are used in a multicluster environment.
  • Rough task list if we migrate from the current service identity component to SPIFFE.
  • (optional) POC using SPIFFE in OSM

keithmattix avatar May 19 '22 16:05 keithmattix

Added default label size/needed. Please consider re-labeling this issue appropriately.

github-actions[bot] avatar Jul 13 '22 00:07 github-actions[bot]

Do we want to add this to v1.3? If we plan to adopt SPIFFE and SPIRE in the multicluster design, this item can be a prerequisite of it.

allenlsy avatar Jul 21 '22 21:07 allenlsy

Sure, it's probably worth exploring the extent to which we may want to utilize SPIFFE/SPIRE. @trstringer your thoughts?

keithmattix avatar Jul 21 '22 21:07 keithmattix

I suggest a deliverable for this issue:

  • Pros and cons comparing SPIFFE to the current service identity mechanism in OSM. In particular, explore the scenario where they are used in a multicluster environment.
  • Rough task list if we migrate from the current service identity component to SPIFFE.
  • (optional) POC using SPIFFE in OSM

allenlsy avatar Jul 21 '22 21:07 allenlsy

Good idea! I added it to the description

keithmattix avatar Jul 21 '22 22:07 keithmattix

Yeah this works for me! Adding it to v1.3.

trstringer avatar Jul 22 '22 01:07 trstringer

/assign

jsturtevant avatar Aug 02 '22 19:08 jsturtevant

I've got my raw notes in this gist. I will summarize the outcomes here:

spiffe integration in OSM

Pros and cons comparing SPIFFE to the current service identity mechanism in OSM.

The aspects of SPIFFE that we are primarily interested in are the x509 SVIDs and the SPIFFE ID. In the future we may be interested in a few of the other APIs in the specification, but they are not required initially.

The SPIFFE specs don't specify how you use this identity, only how it is encoded in the x509 certificate. In our case we have an existing abstraction around identity. We don't really need to change our identity mechanism, but instead make it compatible with the SPIFFE specs. For more information on OSM identity see the identity PR.
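To make the encoding concrete: in an x509 SVID the SPIFFE ID is carried as the certificate's single URI Subject Alternative Name. The following is a minimal sketch using only the Go standard library. The function names `newSVIDLikeCert` and `spiffeIDFromCert`, the example ID, and the self-signed certificate are assumptions made for illustration; they are not OSM code, and a real SVID would be signed by the mesh CA.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"net/url"
	"time"
)

// newSVIDLikeCert (hypothetical helper) creates a self-signed certificate
// whose only URI SAN is the given SPIFFE ID. Self-signing keeps the sketch
// short; it is not how a mesh CA would issue certificates.
func newSVIDLikeCert(spiffeID string) (*x509.Certificate, error) {
	id, err := url.Parse(spiffeID)
	if err != nil {
		return nil, err
	}
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{Organization: []string{"example"}},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
		URIs:         []*url.URL{id}, // the SPIFFE ID lives here
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// spiffeIDFromCert (hypothetical helper) reads the SPIFFE ID back out of a
// certificate, requiring exactly one URI SAN with the "spiffe" scheme.
func spiffeIDFromCert(cert *x509.Certificate) (string, error) {
	if len(cert.URIs) != 1 {
		return "", errors.New("expected exactly one URI SAN")
	}
	if cert.URIs[0].Scheme != "spiffe" {
		return "", errors.New("URI SAN is not a SPIFFE ID")
	}
	return cert.URIs[0].String(), nil
}

func main() {
	cert, err := newSVIDLikeCert("spiffe://cluster.local/ns/bookstore/sa/bookbuyer")
	if err != nil {
		panic(err)
	}
	id, err := spiffeIDFromCert(cert)
	if err != nil {
		panic(err)
	}
	fmt.Println("SPIFFE ID from certificate:", id)
}
```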

Some of the benefits I see for this are:

  • Adhering to the specifications enables integrations with the rest of the ecosystem (see "Who uses SPIFFE?" at https://spiffe.io/). Once we follow the specification, things like OPA integrations and OIDC integrations become options.
  • It opens the door for integrations with SPIRE. SPIRE has a few advantages that would likely be worth exploring further, like workload attestation plugins, federation, and certificate rotation. Attestation could be a big draw here.
  • It potentially gives us a way to reason about identity across multiple types of workload beyond Kubernetes, though the naming schema is still left up to the implementors. Our current schema might work but needs some additional thought.
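As one way to think about the naming schema question, here is a rough sketch of mapping an identity string onto a SPIFFE ID. It assumes the identity has the form `<service account>.<namespace>.<trust domain>` and borrows the `/ns/<namespace>/sa/<service account>` path layout commonly used for Kubernetes workloads. Both the input format and the function name `toSPIFFEID` are assumptions for illustration, not a settled OSM schema.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// toSPIFFEID (hypothetical helper) converts an identity of the assumed form
// "<service account>.<namespace>.<trust domain>" into a SPIFFE ID of the form
// "spiffe://<trust domain>/ns/<namespace>/sa/<service account>".
func toSPIFFEID(serviceIdentity string) (string, error) {
	// Split into at most three parts so a dotted trust domain such as
	// "cluster.local" stays in one piece.
	parts := strings.SplitN(serviceIdentity, ".", 3)
	if len(parts) != 3 {
		return "", fmt.Errorf("identity %q is not <sa>.<namespace>.<trust domain>", serviceIdentity)
	}
	for _, p := range parts {
		if p == "" {
			return "", fmt.Errorf("identity %q has an empty component", serviceIdentity)
		}
	}
	id := url.URL{
		Scheme: "spiffe",
		Host:   parts[2],
		Path:   "/ns/" + parts[1] + "/sa/" + parts[0],
	}
	return id.String(), nil
}

func main() {
	id, err := toSPIFFEID("bookbuyer.bookstore.cluster.local")
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

A non-Kubernetes workload such as a VM has no namespace or service account, which is why the path layout would need the additional thought mentioned above.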

Especially explore the scenario if they are used in multi-cluster environment

At this point I believe SPIFFE isn't a requirement for multi-cluster. It could simplify federation and allow for attestation across different platforms. I would need to dive into the multi-cluster world more to understand what is really needed. Federation could be really useful here.

tasks list

Rough task list if we migrate from current service identity component to SPIFFE.

  • [ ] document current osm Identity solution #5004
  • [ ] Switch from MatchSubjectAltNames to MatchTypedSubjectAltNames in Envoy config #5019
  • [ ] Add feature flag that turns spiffe id generation on
  • [ ] Add feature flag that turns on spiffe id validation (this can't be enabled at the same time as generation in a backwards-compatible way without re-issuing all the Service Certificates). It might be possible to automate this, but given the time frame it might be just as easy to wait and then flip a switch, or to start a new deployment with this enabled.
  • [ ] Implement spiffe ID generation in x509 certificates
    • [ ] clean up our IssueCertificate interface; right now it uses prefixes, and it could benefit from more structure to make parsing decisions easier later
    • [ ]
  • [ ] Implement spiffeid x509 validation for RBAC controls
  • [ ] Implement using spiffeid to make RBAC decisions
  • [ ] Enable control plane certificates to be issued as x509 SVIDs (optional)
  • [ ] Implement bundle API for federation with other systems (optional)
  • [ ] Integrate envoy certificate validator (optional)
  • [ ] Add support in the OSM debugger to view these additional certificate fields
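The validation and RBAC tasks above could look roughly like the following sketch: parse the SPIFFE ID presented by a peer, reject malformed IDs, and then allow or deny based on the trust domain and path. The `policy` type, its fields, and the example IDs are hypothetical names chosen for illustration. In OSM the enforcement would more likely happen in the generated Envoy configuration than in Go code like this.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// policy (hypothetical type) lists which workload paths inside one trust
// domain are allowed to call a service.
type policy struct {
	trustDomain  string
	allowedPaths map[string]bool
}

// allows reports whether the peer identified by rawID may connect.
// It returns an error for strings that are not usable workload SPIFFE IDs
// and (false, nil) for well-formed IDs that the policy does not permit.
func (p policy) allows(rawID string) (bool, error) {
	u, err := url.Parse(rawID)
	if err != nil {
		return false, err
	}
	if u.Scheme != "spiffe" {
		return false, errors.New("scheme must be spiffe")
	}
	if u.Host == "" {
		return false, errors.New("trust domain must not be empty")
	}
	// An ID without a path names a trust domain rather than a workload,
	// so this sketch does not accept it as a peer identity.
	if u.Path == "" {
		return false, errors.New("workload path must not be empty")
	}
	if u.User != nil || u.RawQuery != "" || u.Fragment != "" {
		return false, errors.New("user info, query and fragment are not allowed")
	}
	if u.Host != p.trustDomain {
		return false, nil
	}
	return p.allowedPaths[u.Path], nil
}

func main() {
	bookstore := policy{
		trustDomain:  "cluster.local",
		allowedPaths: map[string]bool{"/ns/bookbuyer/sa/bookbuyer": true},
	}
	for _, id := range []string{
		"spiffe://cluster.local/ns/bookbuyer/sa/bookbuyer",
		"spiffe://cluster.local/ns/bookthief/sa/bookthief",
		"spiffe://other.domain/ns/bookbuyer/sa/bookbuyer",
	} {
		ok, err := bookstore.allows(id)
		fmt.Println(id, ok, err)
	}
}
```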

Spire integration

There are several stages that could be implemented for the spire integration. This would need to be completed after

  • using SPIRE for the Pod to Pod service certificates
    • model this integration after Istio integration (https://blog.spiffe.io/hardening-istio-security-with-spire-d2f4f98f7a63) by using SPIRE's SDS server since Envoy has support
  • using SPIRE as OSM Certificate Provider. This could be done in two ways:
    • Writing our own certificate provider that knows how to call SPIRE.
    • might be able to leverage https://github.com/spiffe/spire/blob/v1.4.0/doc/plugin_server_upstreamauthority_cert_manager.md

To be done for SPIRE integrations:

  • [ ] Prototype of SPIRE using SDS server
  • [ ] prototype of using SPIRE as certificate provider

prototype

(optional) POC using SPIFFE in OSM

I've got a POC of using SPIFFE IDs and x509 SVIDs at https://github.com/openservicemesh/osm/compare/main...jsturtevant:osm:spiffeid?expand=1. This could be used as the basis for a SPIRE prototype.

jsturtevant avatar Aug 23 '22 22:08 jsturtevant

I've taken comment https://github.com/openservicemesh/osm/issues/4750#issuecomment-1224943168 and split it out into separate issues so work can be broken down. See mentioned issues for future work.

#5030 #5031

jsturtevant avatar Aug 24 '22 16:08 jsturtevant