RFC: Nelson routing implementation for Kubernetes backend

  • Feature Name: Nelson routing implementation for Kubernetes backend
  • Start Date: June 2, 2018

Summary

Currently the Kubernetes backend for Nelson does not support Nelson's routing functionality, which prevents service discovery, service-to-service calls, and traffic shifting.

The Magnetar (Hashistack) workflow's implementation of routing centers around Lighthouse, more specifically the Lighthouse protocol. As each workload is deployed, Nelson generates the corresponding Lighthouse protocol for that workload and writes it to Consul, using the workload's stack name as the key (ref00, ref01, ref02). If the workload is a patch update to an existing workload, a traffic shift is indicated by writing a corresponding row into the database which will be picked up by the routing cron (ref00, ref01, ref02, ref03, ref04). This information is later pulled by clients, likely with the Lighthouse client, to actually do the traffic shift.
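
As a rough, conceptual sketch of that Consul write path (not Nelson's actual Scala implementation; the "lighthouse/<stack-name>" key layout and the truncated blob are assumptions for illustration only):

```python
import json

import consul  # python-consul

c = consul.Consul()  # talks to the local Consul agent on localhost:8500
stack = "howdy-http--1-0-388--aeiq8irl"

# The Lighthouse protocol blob for this workload (a full example appears below
# in the "Modifying the Lighthouse protocol" section); truncated here.
blob = {"defaultNamespace": "dev", "namespaces": []}

# Write the blob keyed by stack name so dependents can find it later.
c.kv.put(f"lighthouse/{stack}", json.dumps(blob))

# A Lighthouse client would later read the same key to resolve routes/weights.
_, entry = c.kv.get(f"lighthouse/{stack}")
routes = json.loads(entry["Value"])
```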

This RFC explores the integration of Lighthouse into a Kubernetes environment.

Explanation

One of the big differences between a Kubernetes environment and a Hashistack one is the prescriptive nature of Kubernetes. Where Hashistack provides Consul, a generic distributed key-value store with which users can plug in their own service discovery logic, Kubernetes pushes for using Kubernetes Service objects. A Service object essentially groups a set of containers together (likely replicas of the same service) using label selectors and makes them automagically accessible through a single address, similar to a load balancer. Service objects get assigned a cluster-internal DNS name and IP address, and if the Service is of type LoadBalancer it may also create an external load balancer depending on the provider (e.g. AWS, GCP). This tends to be the "recommended" way to do service discovery in Kubernetes.
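
For concreteness, a minimal sketch of creating such a Service with the official Kubernetes Python client (the names, labels, and ports here are illustrative only):

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

# Group every pod labelled app=howdy-http behind a single stable address.
svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="howdy-http"),
    spec=client.V1ServiceSpec(
        selector={"app": "howdy-http"},                    # label selector over pods
        ports=[client.V1ServicePort(port=80, target_port=9000)],
        type="LoadBalancer",                               # may provision an external LB
    ),
)
v1.create_namespaced_service(namespace="dev", body=svc)
# In-cluster, this Service is then reachable as howdy-http.dev.svc.cluster.local
```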

The difficulty, then, is that there is no generic key-value store available in Kubernetes, which the current Lighthouse integration needs since it dumps a JSON blob into one (it is worth noting that this is just what Lighthouse currently does; other serialization mechanisms are definitely possible). Kubernetes does depend on etcd, but as far as I know that etcd instance is internal and not designed for general use. Service objects can have labels themselves, but labels are quite limited and are not designed to be as general purpose as Consul.

Design

I see three possible implementations for this RFC:

  1. Require operators to deploy a Consul or similar key-value store and use the existing mechanisms
  2. Require operators to deploy a dedicated Lighthouse service which can shim between environmental differences
  3. Change how the Lighthouse protocol is serialized for Kubernetes

The first two require minimal code change but come with the downside that another stateful, highly-available system must be deployed and maintained. In the case of (2) we would also be effectively breaking from and competing with the rest of the existing Kubernetes ecosystem (see the Istio note in the "Alternatives" section). On the upside, especially in (2), client libraries need only be programmed against a generic API surface, and any differences in environment can be handled in how the service is deployed.

Among the niceties of Nelson, however, is that it is just a single service designed to be deployed into an existing environment, reusing the surrounding ecosystem where possible (e.g. in Kubernetes, Nelson namespaces are Kubernetes namespaces, Nelson healthchecks use Kubernetes healthchecks, etc.). Therefore the rest of this RFC will focus on the third design.

Modifying the Lighthouse protocol

The Lighthouse protocol of a given service is a JSON blob written to a key locatable given the service's stack name. The JSON blob contains:

  • The default namespace
  • DNS domain/TLD for the Consul instance
  • For each namespace in the datacenter:
    • the service name
    • the service stack name
    • the target port
    • the protocol for the port (e.g. http)
    • the weight

The first two, as far as I can tell, are not actively used by Lighthouse, and the second does not make sense in a Kubernetes environment anyway. The rest, however, can be written as individual labels on a Kubernetes Service named after the stack name: given a stack name it is easy to locate the Service with that name and read its labels. Since Kubernetes comes with its own notion of namespaces (and the Kubernetes backend ties Nelson namespaces to this feature), there is no need to put service discovery information for every namespace into a single object - each Service in each Kubernetes namespace just carries labels for itself. Therefore, for this Lighthouse blob:

{
  "namespaces": [
    {
      "routes": [
        {
          "port": "default",
          "targets": [
            {
              "weight": 100,
              "protocol": "http",
              "port": 9000,
              "stack": "howdy-http--1-0-388--aeiq8irl"
            }
          ],
          "service": "howdy-http"
        }
      ],
      "name": "dev"
    }
  ],
  "domain": "your.consul-tld.com",
  "defaultNamespace": "dev"
}

the corresponding Kubernetes environment might have a Service object in the dev Kubernetes namespace called howdy-http--1-0-388--aeiq8irl with the labels:

  • service=howdy-http
  • portName=default
  • port=9000
  • protocol=http
  • weight=100
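
A sketch of the Canopus workflow step that would emit such a Service, again using the Python client for illustration (the pod selector label stackName is an assumption about how the deployment's pods are labelled):

```python
from kubernetes import client, config

config.load_incluster_config()
v1 = client.CoreV1Api()

stack = "howdy-http--1-0-388--aeiq8irl"

# Lighthouse routing data serialized as labels; label values must be strings.
lighthouse_labels = {
    "service": "howdy-http",
    "portName": "default",
    "port": "9000",
    "protocol": "http",
    "weight": "100",
}

svc = client.V1Service(
    metadata=client.V1ObjectMeta(name=stack, labels=lighthouse_labels),
    spec=client.V1ServiceSpec(
        selector={"stackName": stack},  # assumed label on the stack's pods
        ports=[client.V1ServicePort(name="default", port=9000)],
    ),
)
v1.create_namespaced_service(namespace="dev", body=svc)
```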

A Lighthouse client for Kubernetes would then talk to the Kubernetes API server, generally locatable from a Pod under the cluster DNS name kubernetes.default.svc, query for a Service using the stack name, and from there inspect its labels.
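
A minimal sketch of such a client, assuming it runs in a pod whose service account is allowed to read Services in the namespace:

```python
from kubernetes import client, config

config.load_incluster_config()  # resolves kubernetes.default.svc + pod credentials
v1 = client.CoreV1Api()

svc = v1.read_namespaced_service("howdy-http--1-0-388--aeiq8irl", "dev")
labels = svc.metadata.labels

# Reassemble the routing target from the label-based serialization.
target = {
    "service":  labels["service"],
    "portName": labels["portName"],
    "port":     int(labels["port"]),
    "protocol": labels["protocol"],
    "weight":   int(labels["weight"]),
}
```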

Drawbacks

This label-based approach is a different serialization of the Lighthouse protocol and thus requires a different Lighthouse client implementation to consume it. The existing protocol assumes a generic key-value store is available, but given where the Kubernetes ecosystem is going, Service objects (or something similar) seem like the most ecosystem-friendly approach.

Alternatives

Istio is a control-plane implementation that sits on top of Kubernetes and provides many of the features discussed here, such as versioning and traffic shifting. At the time of writing Istio is still in an experimental stage, but in the future it may be possible to integrate Nelson and Lighthouse with Istio. This too would likely require a change in the Lighthouse protocol serialization, though it may ease the burden on consuming clients since Istio has its own middleware for actually doing the traffic shifting.

Implementation steps

  • Add a step to the Canopus (Kubernetes) workflow that creates a Service per deployment with the appropriate labels
  • Add logic in the routing cron that, when in a Kubernetes environment, works with Services and labels during a traffic shift (see the sketch below)
  • Create a new Lighthouse client that consumes the described Kubernetes Lighthouse serialization scheme
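
To make the routing-cron step concrete, a traffic shift between two stacks could amount to two label patches. A sketch (the old stack name and the 25% split are hypothetical, and weights are assumed to sum to 100 as in the Lighthouse example above):

```python
from kubernetes import client, config

config.load_incluster_config()
v1 = client.CoreV1Api()

def shift_traffic(namespace, old_stack, new_stack, new_weight):
    """Move new_weight percent of traffic onto the new stack's Service."""
    # Strategic-merge patches that only touch the weight label (values are strings).
    v1.patch_namespaced_service(old_stack, namespace,
        {"metadata": {"labels": {"weight": str(100 - new_weight)}}})
    v1.patch_namespaced_service(new_stack, namespace,
        {"metadata": {"labels": {"weight": str(new_weight)}}})

shift_traffic("dev",
              "howdy-http--1-0-387--zxcv0987",   # hypothetical previous stack
              "howdy-http--1-0-388--aeiq8irl",
              25)
```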

adelbertc avatar Jun 03 '18 01:06 adelbertc

Firstly, thanks for writing up this great explanation; it's a good jumping-off point for us to get some kind of discussion and work started on this much-needed feature.

The possible solutions section is probably fair. The only other option I can think of, for completeness, is that we could externalize some of this pain, such that there exists a runtime lighthouse service that Nelson is integrated with. This has two upsides:

  1. Client libraries can be made stable and would not need integrations up the wazoo; just integrations with this one service protocol. This would likely be a slim gRPC service. Naturally this doesn't eliminate the pain; rather, it just pushes it around to a point in the system that is easily revisable (as opposed to being sharded through every client system).

  2. Minimizes changes to nelson itself for arbitrary new routing systems.

Conversely, it has the following downsides:

  1. We're directly competing with Istio on K8s (although we're generic and they are not).
  2. We then force users to run a highly-available component (though I suppose we are just making this explicit, as they already had to do that with Consul).

Generally speaking, I've been burnt by libraries and having to evolve them in lock-step, so as I write this now I'm wondering if we could actually have a runtime lighthouse service and then somehow integrate a filter in Envoy (not dissimilar to the RateLimit filter) that does the mediation between the lighthouse service and the user's application? That idea is a little half-baked, but on the face of it could work.

Lighthouse was originally implemented to solve the problem of context propagation for namespaces and - later - tracing and experimental data. As a user-facing API, I think that will still make sense in most environments; however, I am a bit worried about how different environments might need to behave if we don't implement a lighthouse service. Food for thought.

Other notes

A couple of notes below about things in the RFC that are not quite accurately stated:

The difficulty then lies in that there is no generic key-value store available in Kubernetes, which the Lighthouse protocol needs.

So let's be clear here: the lighthouse protocol is a description of routing trees. The present lighthouse implementation expects it to be available in Consul in a JSON format, but this is not an intrinsic part of the protocol; rather, it is an implementation detail. We should consider revising the protocol format entirely (if needed), or simply make other options available.

timperrett avatar Jun 03 '18 16:06 timperrett

@timperrett Good point on the Lighthouse service option, added it to the RFC.

The Envoy filter idea is interesting, though one of the niceties of Nelson I think is how it has pluggable backends for different components. Depending on how we implement the integration with Envoy, I fear Nelson becomes too prescriptive on that front. Perhaps we can come up with a revision of Lighthouse that Nelson exposes and then provide an Envoy filter people can grab off the shelf, but should anyone want to use their own proxy (or even library) they still can.

Can you expand a bit on Lighthouse, like what it was originally designed to do? I have always viewed it as just a way for Nelson to publish generic routing information for a given deployment that dependents can pull and use.

Also will add clarification about the Lighthouse protocol.

Justin made a good point on Twitter about Kubernetes CustomResourceDefinitions, which essentially allow us to define new kinds of objects that can be stored in and queried from the Kubernetes API server. As far as I can tell this is almost a sort of hook into the otherwise hidden etcd instance.

An initial look opens up another possible implementation route, which is to create a Lighthouse resource definition and serialize the Lighthouse protocol for services through it. It seems such a resource could be scoped either to the entire cluster, which would mimic the current Consul integration, or to a namespace. Given that the standard Kubernetes objects are namespaced, it seems namespacing is the way to go for this use case; curious if anyone has thoughts on the tradeoffs between making Lighthouse information cluster-wide vs. namespaced.
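
To illustrate what that could look like: once a (hypothetical) Lighthouse CRD exists, Nelson could write one namespaced custom object per stack via the generic custom-objects API. The group, version, and plural below are made up for the sketch:

```python
from kubernetes import client, config

config.load_incluster_config()
crds = client.CustomObjectsApi()

crds.create_namespaced_custom_object(
    group="nelson.example.com",      # hypothetical CRD group
    version="v1alpha1",
    namespace="dev",
    plural="lighthouses",
    body={
        "apiVersion": "nelson.example.com/v1alpha1",
        "kind": "Lighthouse",
        "metadata": {"name": "howdy-http--1-0-388--aeiq8irl"},
        "spec": {
            # The Lighthouse routing data could be stored essentially verbatim.
            "service": "howdy-http",
            "routes": [{
                "port": "default",
                "targets": [{
                    "stack": "howdy-http--1-0-388--aeiq8irl",
                    "port": 9000,
                    "protocol": "http",
                    "weight": 100,
                }],
            }],
        },
    },
)
```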

One thing that concerns me, though, is that this seems to overlap with what Services were intended to do. Also, by defining our own resource definition we would no longer get things like a cluster-internal DNS name + IP address for the selected Pods for free.

adelbertc avatar Jun 03 '18 17:06 adelbertc

To update this thread on where I'm at with my thinking: I'm presently devising a scheme to integrate https://github.com/turbinelabs/rotor into Nelson, so that we get the control plane and service discovery off the shelf. This moves Nelson out of the hot path as much as possible, and means that we are building fewer proprietary things. Rotor, for example, will "just work" with either k8s or Consul.

Similarly, whatever gets built for the Rotor integration should also support Istio Pilot (not Mixer; I'd want to ignore that). At this point I'm largely wanting to integrate with things the community is converging on, instead of pushing down the Lighthouse path (although I plan on having some kind of library / subsequent version of Lighthouse).

timperrett avatar Aug 10 '18 04:08 timperrett

Can you expand on how Rotor is used? Is this another container we launch, and if so, what does it do and where does it run? I'm just not familiar with this project at all.

I'd be thrilled if we were in general less prescriptive, but discovery is the super-hard problem we might be in a good place to help with. So if we have any way to be prescriptive about discovery that works in lots of different environments, I'd be all over providing easy happy paths.

stew avatar Aug 10 '18 06:08 stew

@stew sure. The simplest Rotor setup is one where you have a central Rotor that does service discovery (based on k8s labels, not dissimilar to what was proposed above by @adelbertc) and provides the Envoy xDS APIs (the new v2 gRPC streaming ones) as a service (filling the gap left by consort). You then run a sidecar Envoy per pod, configured to talk to that Rotor. Rotor in turn is configured via a set of route definitions, which I'm planning on generating from Nelson (to replace what we generated with Lighthouse).

timperrett avatar Aug 10 '18 14:08 timperrett

instead of pushing down the Lighthouse path (although I plan on having some kind of library / subsequent version of Lighthouse)

I think having a non-{Istio, Rotor} solution, at least until the community truly converges on one of those, will still be useful. This is also a selfish suggestion of mine, since our K8s team isn't deploying Istio near-term so we'll need to get by on Services in the interim. But I also think it's useful since using Services is a "valid" way of doing service discovery/service-to-service calls in K8s, and it would add to the non-prescriptiveness. Between Services, Istio, and Rotor I think we would cover the majority of the service discovery solutions used in Kubernetes - at least for the next year, when inevitably another ten come out.

In any case, I'm likely going to have to implement this Services-based routing either here upstream or internally in a lightly vendored Nelson.

Perhaps one thing that might guide how we do this: do you see your Rotor/Istio integration requiring #79 to be implemented? If so, we can do the Services-based smart-client thing in v1 of the manifest, since that's all we can support now anyway, and potentially see whether #79 opens the door to both Services and a Rotor/Istio solution. WDYT?

adelbertc avatar Aug 10 '18 16:08 adelbertc