osm icon indicating copy to clipboard operation
osm copied to clipboard

[Refactoring Proposal] Abstraction for Proxy Control Plane component - comes with refactoring & reference implementation

Open naqvis opened this issue 1 year ago • 8 comments

Reference: https://github.com/openservicemesh/osm/issues/4632 Proposal wiki link : Abstractions for OSM Proxy Control Plane

Consideration has been taken to keep the changes transparent to osm end-user and defaults to current envoy as driver, but users wishing to proceed with 3rd party driver implementation can achieve that via osm installation flag --set=osm.sidecarClass=XXX where XXX refers to sidecar Driver.

This refactoring proposal is being raised for discussion and once an agreement (if any) has been made, we will proceed with raising a PR with required code changes.

Proposal

This proposal is to provide abstractions for OSM Proxy Control Plane component, for it to break hard-coded dependencies on current envoy-based implementation, move current envoy-based implementation as a reference implementation, and pave the way for future integration of 3rd party sidecar proxy services.

Background

OSM Sidecar Proxy Control Plane is one of the core components and plays a key part in operating the service mesh. This component implements the interfaces required by specific sidecar proxy and provides continuous configuration updates to sidecar proxies. OSM current implementation of this key component is solely based on Envoy’s go-control-plane xDS v3 API, thus having a strong dependency or tight-coupling on Envody (vendor-specific protocol and/or sidecar).

And key focus of this proposal is to:

  • Provide an abstraction (Sidecar Driver Interface)
  • Re-factor existing implementation to implement the abstraction and break the coupling.
  • Open doors to 3rd parties for easy integration of sidecar

Architecture Diagram

Architecture Diagram

The Proxy Control plane component will be responsible for the life cycle management of sidecar drivers and communicating with drivers using the Sidecar Driver interface.

Vendors wishing to extend or provide a sidecar will implement the abstraction and provide configuration updates to their respective sidecar implementations in format/manners which suit their sidecar-specific needs.

Sidecar Driver Lifecycle and Proxy Control Plane Interaction

Sidecar Driver Lifecycle

Sidecar Driver Interface

A Sidecar Driver needs to implement the Driver interface to integrate with the OSM Injector and Controller.

// Driver is an interface that must be implemented by a sidecar driver.
// Patch method is invoked by osm-injector and Start method is invoked by osm-controller
type Driver interface {
	Patch(ctx context.Context) error
	Start(ctx context.Context) (health.Probes, error)
}

Start returns Sidecar driver HealthProbes

// HealthProbes is to serve as an indication how to probe the sidecar driver's health status
type HealthProbes struct {
	liveness, readiness, startup *HealthProbe
}

// HealthProbe is an API endpoint to indicate the current status of the sidecar driver.
type HealthProbe struct {
	path      string
	port      int32
	http      bool
	timeout   time.Duration
	tcpSocket bool
}

Patch

When a Pod gets deployed, OSM Injector registered webhook will inject configurations for this Pod, such as creating certificates for Sidecar and setting Labels, Annotations, Metrics, etc. Webhook will invoke the Patch method, which needs to implement:

  • [ ] 1. Inject the container and configure network rules for Sidecar, such as forwarding network traffic etc.

  • [ ] 2. Inject the Sidecar container, configure volumes to be attached to Sidecar configuration files, and perform health checks

The information needed for this method is encapsulated in the InjectorContext struct:

type InjectorContext struct {
	context.Context

	Pod                          *corev1.Pod
	MeshName                     string
	OsmNamespace                 string
	PodNamespace                 string
	PodOS                        string
	ProxyCommonName              certificate.CommonName
	ProxyUUID                    uuid.UUID
	Configurator                 configurator.Configurator
	KubeClient                   kubernetes.Interface
	BootstrapCertificate         *certificate.Certificate
	ContainerPullPolicy          corev1.PullPolicy
	InboundPortExclusionList     []int
	OutboundPortExclusionList    []int
	OutboundIPRangeInclusionList []string
	OutboundIPRangeExclusionList []string
	OriginalHealthProbes         HealthProbes

	DryRun bool
}

Start

OSM Controller on start will invoke Start method, which need to implement:

  • [ ] 1. A Sidecar proxy controls plane services and encapsulates SMI traffic policies into vendor's sidecard specific format

  • [ ] 2. Return HealthProbes instance

The information needed for this method is encapsulated in the ControllerContext struct:

type ControllerContext struct {
	context.Context

	ProxyServerPort  uint32
	ProxyServiceCert *certificate.Certificate
	OsmNamespace     string
	KubeConfig       *rest.Config
	Configurator     configurator.Configurator
	MeshCatalog      catalog.MeshCataloger
	CertManager      certificate.Manager
	MsgBroker        *messaging.Broker
	DebugHandlers    map[string]http.Handler
	CancelFunc       func()
	Stop             chan struct {
	}
}

Context

When the Patch and Start methods are invoked, context.WithCancel encapsulates InjectorContext and ControllerContext as context.context:

background := driver.InjectorContext{
...
}
ctx, cancel := context.WithCancel(&background)
defer cancel()
background := driver.ControllerContext{
...
}
ctx, cancel := context.WithCancel(&background)
defer cancel()
background.CancelFunc = cancel

Sidecar driver implementation can access InjectorCtxKey and ControllerCtxKey via:

parentCtx := ctx.Value(&driver.InjectorCtxKey)
if parentCtx == nil {
	return nil, errors.New("missing Injector Context")
}
injCtx := parentCtx.(*driver.InjectorContext)
parentCtx := ctx.Value(&driver.ControllerCtxKey)
if parentCtx == nil {
	return nil, errors.New("missing Controller Context")
}
ctrlCtx := parentCtx.(*driver.ControllerContext)

Reference implementation

pkg/sidecar/driver/types.go

Sidecar Driver Adaptor**

Variables declaration

  • The map object drivers acts as an object container for sidecar drivers
  • Read/write lock driversMutex to ensure thread-safe operation on drivers object
  • engineDriver Keep information of currently in use Sidecar driver
var (
	driversMutex sync.RWMutex
	drivers      = make(map[string]driver.Driver)
	engineDriver driver.Driver
)

Sidecar Driver Registration

Register method is used to register implemented driver.

Note: Multiple invocations of Register method will panic, so its responsibility of Driver implementors to ensure driver is registered only once.

// Register makes a sidecar driver available by the provided name.
// If Register is called twice with the same name or if driver is nil,
// it panics.
func Register(name string, driver driver.Driver) {
	driversMutex.Lock()
	defer driversMutex.Unlock()
	if driver == nil {
		panic("sidecar: Register driver is nil")
	}
	if _, dup := drivers[name]; dup {
		panic("sidecar: Register called twice for driver " + name)
	}
	drivers[name] = driver
}

Sidecar Driver Adaptor

Patch

// Patch is an adapter method for InjectorDriver.Patch
func Patch(ctx context.Context) error {
	driversMutex.RLock()
	defer driversMutex.RUnlock()
	if engineDriver == nil {
		return errors.New("sidecar: unknown driver (forgot to init?)")
	}
	return engineDriver.Patch(ctx)
}

Start

// Start is an adapter method for ControllerDriver.Start
func Start(ctx context.Context) (health.Probes, error) {
	driversMutex.RLock()
	defer driversMutex.RUnlock()
	if engineDriver == nil {
		return nil, errors.New("sidecar: unknown driver (forgot to init?)")
	}
	return engineDriver.Start(ctx)
}

sidecar Driver Initialization

// InitDriver is to serve as an indication of the using sidecar driver
func InitDriver(driverName string) error {
	driversMutex.Lock()
	defer driversMutex.Unlock()
	registeredDriver, ok := drivers[driverName]
	if !ok {
		return fmt.Errorf("sidecar: unknown driver %q (forgotten import?)", driverName)
	}
	engineDriver = registeredDriver
	return nil
}

Reference Implementation

pkg/sidecar/adapter.go

Sidecar Driver Implementations

Envoy Driver

Pipy Driver

Driver Integration with DebugServer

In the Start method of the driver, you can use ControllerContext.DebugHandlers to enrich DebugServer debugging

// EnvoySidecarDriver is the envoy sidecar driver
type EnvoySidecarDriver struct {
	ctx *driver.ControllerContext
}

// Start is the implement for ControllerDriver.Start
func (sd EnvoySidecarDriver) Start(ctx context.Context) (health.Probes, error) {
	parentCtx := ctx.Value(&driver.ControllerCtxKey)
	if parentCtx == nil {
		return nil, errors.New("missing Controller Context")
	}
	ctrlCtx := parentCtx.(*driver.ControllerContext)
	...

	ctrlCtx.DebugHandlers["/debug/proxy"] = sd.getProxies(proxyRegistry)
	ctrlCtx.DebugHandlers["/debug/xds"] = sd.getXDSHandler(xdsServer)

	...
}

Using Sidecar Driver

Register

import (
	...
	_ "github.com/openservicemesh/osm/pkg/sidecar/providers/envoy/driver"
	_ "github.com/openservicemesh/osm/pkg/sidecar/providers/pipy/driver"
	...
)

Driver implementation should use init method for registration purposes.

const (
	driverName = `pipy`
)

func init() {
	sidecar.Register(driverName, new(PipySidecarDriver))
}

Installation

Install respective sidecar driver registered under sidecarClass of MeshConfig

cfg := configurator.NewConfigurator(...)
err = sidecar.InstallDriver(cfg.GetSidecarClass())
if err != nil {
    events.GenericEventRecorder().FatalEvent(err, events.InitializationError, "Error creating sidecar driver")
}

Patch

func (wh *mutatingWebhook) createPatch(pod *corev1.Pod, req *admissionv1.AdmissionRequest, proxyUUID uuid.UUID) ([]byte, error) {
	...

	background := driver.InjectorContext{
		Pod:                          pod,
		MeshName:                     wh.meshName,
		OsmNamespace:                 wh.osmNamespace,
		PodNamespace:                 namespace,
		PodOS:                        podOS,
		ProxyCommonName:              cn,
		ProxyUUID:                    proxyUUID,
		Configurator:                 wh.configurator,
		KubeClient:                   wh.kubeClient,
		BootstrapCertificate:         bootstrapCertificate,
		ContainerPullPolicy:          wh.osmContainerPullPolicy,
		InboundPortExclusionList:     inboundPortExclusionList,
		OutboundPortExclusionList:    outboundPortExclusionList,
		OutboundIPRangeInclusionList: outboundIPRangeInclusionList,
		OutboundIPRangeExclusionList: outboundIPRangeExclusionList,
		OriginalHealthProbes:         originalHealthProbes,
		DryRun:                       req.DryRun != nil && *req.DryRun,
	}
	ctx, cancel := context.WithCancel(&background)
	defer cancel()

	if err = sidecar.Patch(ctx); err != nil {
		return nil, err
	}

	return json.Marshal(makePatches(req, pod))
}

Start

func main() {
	...

	background := driver.ControllerContext{
		ProxyServerPort:  cfg.GetProxyServerPort(),
		ProxyServiceCert: proxyServiceCert,
		OsmNamespace:     osmNamespace,
		KubeConfig:       kubeConfig,
		Configurator:     cfg,
		MeshCatalog:      meshCatalog,
		CertManager:      certManager,
		MsgBroker:        msgBroker,
		DebugHandlers:    make(map[string]http.Handler),
		Stop:             stop,
	}
	ctx, cancel := context.WithCancel(&background)
	defer cancel()
	background.CancelFunc = cancel

	// Create and start the sidecar proxy service
	healthProbes, err := sidecar.Start(ctx)
	if err != nil {
		events.GenericEventRecorder().FatalEvent(err, events.InitializationError, "Error initializing proxy control server")
	}

	...
	// Health/Liveness probes
	funcProbes := []health.Probes{healthProbes, smi.HealthChecker{DiscoveryClient: clientset.Discovery()}}
	...

	// Create DebugServer and start its config event listener.
	// Listener takes care to start and stop the debug server as appropriate
	debugConfig := debugger.NewDebugConfig(certDebugger, meshCatalog, kubeConfig, kubeClient, cfg, k8sClient, msgBroker)
	go debugConfig.StartDebugServerConfigListener(background.DebugHandlers, stop)

	...
}

Reference Implementation

pkg/injector/patch.go

cmd/osm-injector/osm-injector.go

cmd/osm-controller/osm-controller.go

Changes to OSM API

Addition of SidecarClass and SidecarDriverSpec in SidecarSpec.

SidecarClass Specifies the default sidecar driver

// SidecarDriverSpec is the type to represent OSM's sidecar driver define.
type SidecarDriverSpec struct {
	...

	// SidecarClass defines the class used for the proxy sidecar.
	SidecarClass string `json:"sidecarClass,omitempty"`

	// SidecarImage defines the container image used for the proxy sidecar.
	SidecarImage string `json:"sidecarImage,omitempty"`

	// SidecarWindowsImage defines the windows container image used for the proxy sidecar.
	SidecarWindowsImage string `json:"SidecarImageWindowsImage,omitempty"`

	// InitContainerImage defines the container image used for the init container injected to meshed pods.
	InitContainerImage string `json:"initContainerImage,omitempty"`

	// SidecarDrivers defines the sidecar supported.
	SidecarDrivers []SidecarDriverSpec `json:"sidecarDrivers,omitempty"`

	...
}

type SidecarDriverSpec struct {
	// SidecarName defines the name of the sidecar driver.
	SidecarName string `json:"sidecarName,omitempty"`

	// SidecarImage defines the container image used for the proxy sidecar.
	SidecarImage string `json:"sidecarImage,omitempty"`

	// SidecarWindowsImage defines the windows container image used for the proxy sidecar.
	SidecarWindowsImage string `json:"SidecarImageWindowsImage,omitempty"`

	// InitContainerImage defines the container image used for the init container injected to meshed pods.
	InitContainerImage string `json:"initContainerImage,omitempty"`

	// ProxyServerPort is the port on which the Discovery Service listens for new connections from Sidecars
	ProxyServerPort uint32 `json:"proxyServerPort"`

	// SidecarDisabledMTLS defines whether mTLS is disabled.
	SidecarDisabledMTLS bool `json:"sidecarDisabledMTLS"`
}

Reference Implementation

pkg/apis/config/v1alpha1/mesh_config.go

pkg/apis/config/v1alpha2/mesh_config.go

Changes to Helm Charts

values.yaml

osm:
  ...
  # -- The class of the OSM Sidecar Driver
  sidecarClass: envoy
  # -- Sidecar image for Linux workloads
  sidecarImage: envoyproxy/envoy:v1.19.3
  # -- Sidecar drivers supported by osm
  sidecarDrivers:
    - sidecarName: pipy
      # -- Sidecar image for Linux workloads
      sidecarImage: flomesh/pipy-nightly:latest
      # -- Remote destination port on which the Discovery Service listens for new connections from Sidecars.
      proxyServerPort: 6060
    - sidecarName: envoy
      # -- Sidecar image for Linux workloads
      sidecarImage: envoyproxy/envoy:v1.19.3
      # -- Sidecar image for Windows workloads
      sidecarWindowsImage: envoyproxy/envoy-windows:latest
      # -- Remote destination port on which the Discovery Service listens for new connections from Sidecars.
      proxyServerPort: 15128
   ...
  • Configurations settings of sidecarImage , sidecarWindowsImage under osm takes precedence over values defined under osm.sidecarDrivers with same keys.
  • Only if these settigns are missing under osm then values under osm.sidecarDrivers will be choosen.

Reference implementation as below:

// GetSidecarImage returns the sidecar image
func (c *client) GetSidecarImage() string {
	image := c.getMeshConfig().Spec.Sidecar.SidecarImage
	if len(image) == 0 {
		sidecarClass := c.getMeshConfig().Spec.Sidecar.SidecarClass
		sidecarDrivers := c.getMeshConfig().Spec.Sidecar.SidecarDrivers
		for _, sidecarDriver := range sidecarDrivers {
			if strings.EqualFold(strings.ToLower(sidecarClass), strings.ToLower(sidecarDriver.SidecarName)) {
				image = sidecarDriver.SidecarImage
				break
			}
		}
	}
	if len(image) == 0 {
		image = os.Getenv("OSM_DEFAULT_SIDECAR_IMAGE")
	}
	return image
}

preset-mesh-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: preset-mesh-config
  namespace: {{ include "osm.namespace" . }}
data:
  preset-mesh-config.json: |
    {
      "sidecar": {
        "enablePrivilegedInitContainer": {{.Values.osm.enablePrivilegedInitContainer | mustToJson}},
        "logLevel": {{.Values.osm.sidecarLogLevel | mustToJson}},
        "maxDataPlaneConnections": {{.Values.osm.maxDataPlaneConnections | mustToJson}},
        "configResyncInterval": {{.Values.osm.configResyncInterval | mustToJson}},
        "sidecarClass": {{.Values.osm.sidecarClass | mustToJson }},
        "sidecarDrivers": {{.Values.osm.sidecarDrivers | mustToJson }}
      },
      ...
    }

naqvis avatar Jul 05 '22 06:07 naqvis

Any input or feedback is appreciated

naqvis avatar Jul 09 '22 13:07 naqvis

Hey @naqvis, thanks for the detailed proposal. This is definitely an interesting topic worth reviewing and discussing in greater detail. I am wondering if you would be able to draft the proposal in a Google doc so that it facilitates easier discussion via comments on the doc. Here's a design doc I drafted recently for your reference. Let me know if that's feasible.

Thanks

shashankram avatar Jul 11 '22 16:07 shashankram

I'll add that https://github.com/openservicemesh/osm/issues/4862 and https://github.com/openservicemesh/osm/issues/4863 have implications in how we will change:

  • Endpoints providers
  • Internal MeshCatalog representations like Policy objects
  • K8s clients
  • Possibly other internals as well

My gut tells me that attempting an additional refactor simultaneously with the two linked above would be untenable, but a draft design doc is always welcome!

As another data point, I don't think OSM wants to be in the business of supporting an additional sidecar implementation at this point in time. However it may be more palatable to expose endpoints that could be queried for OSM internal objects like, MeshService, endpoints, policies, etc (again note that some of these internal objects are likely to have some changes due to the above refactoring). With that, a separate binary could be responsible for configuring the sidecars.

I should note that I say all of the above as my personal opinion, without having conferred with the rest of the maintainers.

steeling avatar Jul 11 '22 20:07 steeling

Thanks @shashankram and @steeling

I am wondering if you would be able to draft the proposal in a Google doc so that it facilitates easier discussion via comments on the doc.

Sure I'll take a look at design doc and try to draft this proposal in google doc format.

As another data point, I don't think OSM wants to be in the business of supporting an additional sidecar implementation at this point in time.

I assume there is a confusion here, proposal is to refactor existing design so that its no longer coupled with reference implementation. Proposal provides links to reference implementations as a proof of concept to support refactoring works. We in no means are requesting OSM to integrate and support Pipy sidecar implementation as its another reference implementation.

naqvis avatar Jul 12 '22 01:07 naqvis

Hey @naqvis, thanks for the detailed proposal. This is definitely an interesting topic worth reviewing and discussing in greater detail. I am wondering if you would be able to draft the proposal in a Google doc so that it facilitates easier discussion via comments on the doc. Here's a design doc I drafted recently for your reference. Let me know if that's feasible.

Thanks

I've created proposal doc and granted editor rights to all.

naqvis avatar Jul 12 '22 12:07 naqvis

As a heads up, we've allocated #4862 to our v1.3 release. I think trying to apply another layer of refactoring/abstraction simultaneously would lead to the incorrect layers of abstraction being drawn, so I think it best we revisit this after the v1.3 release

steeling avatar Jul 20 '22 18:07 steeling

This issue will be closed due to a long period of inactivity. If you would like this issue to remain open then please comment or update.

github-actions[bot] avatar Sep 19 '22 00:09 github-actions[bot]

Commeting here for posterity, but https://github.com/openservicemesh/osm/pull/5109 tackles this decoupling in a different way. https://github.com/openservicemesh/osm/issues/5112 will help move that forward too. Note that now the envoy.Proxy object contains no specific envoy references, so will become osm.Proxy.

There's still a lot in this current proposal with respect to different sidecars specs, but I think more of that could be hidden within the new Generator type, instead of us supporting multiple proxies directly by creating Spec's for sidecar drivers.

And there's still more we can do to abstract over this. In particular the injector lacks a lot of abstraction, so we can push more of that behind interfaces as well to decouple the main logic from both k8s and envoy. The PR and issue linked in this comment will conclude a bulk of this quarter's refactoring however

steeling avatar Sep 19 '22 21:09 steeling

This issue will be closed due to a long period of inactivity. If you would like this issue to remain open then please comment or update.

github-actions[bot] avatar Nov 20 '22 00:11 github-actions[bot]

Issue closed due to inactivity.

github-actions[bot] avatar Nov 28 '22 00:11 github-actions[bot]