Support embedded NATS as alternate cluster option to etcd

Open bruth opened this issue 1 year ago • 22 comments

Is your feature request related to a problem? Please describe.

Currently, embedded HA is supported only by etcd. With the option of embedded NATS that was added to Kine (as of v0.10.0/v0.10.1), NATS can be another option since it supports native clustering as well.

Describe the solution you'd like

Add native support for NATS as an alternative cluster option when doing --cluster-init.

Describe alternatives you've considered

There are no other native options. However, with an external NATS configuration (set via --datastore-endpoint), the nodes can be clustered without the k3s layer being aware of it. This provides HA/FT of the KV data, but k3s itself is not technically running in clustered mode.

Additional context

I plan on contributing this, but any guidance or pointers on things to be aware of are welcome!

bruth avatar May 08 '23 10:05 bruth

Looking forward to seeing this

gedw99 avatar May 15 '23 17:05 gedw99

cc @rancher-max @cwayne18

brandond avatar May 15 '23 18:05 brandond

This is cool and a great feature suggestion! Thank you!

I have some clarifying questions to determine how deep down the proverbial rabbit hole we should go:

  1. Is k3s expected to supply backup/restore functionality?
     a. Would this extend cluster-reset/cluster-reset-restore-path functionality?
     b. Would it be a new command?
     c. Does it follow NATS's approach or is it done differently?
  2. Should an operator be able to run NATS in their cluster while also using it as the embedded datastore?
  3. Should NATS certs be rotated during manual certificate rotation?
     a. What is the expectation when an operator provides their own certs? Ref: https://docs.k3s.io/cli/certificate#using-custom-ca-certificates and specifically the note: etcd files are required even if embedded etcd is not in use.

rancher-max avatar May 15 '23 19:05 rancher-max

Those are all good questions!

At the moment I see the embedded NATS as a replacement for sqlite only; while it is possible to host a multi-node cluster using the embedded NATS server, @bruth or someone on his team will need to provide instructions on how to set this up as I believe it requires a user-managed config file to accomplish.

If it is desired that K3s support multi-server clusters by managing the configuration and cluster membership, allow for backup/restore using the embedded NATS datastore, and all the other stuff that would provide complete parity with the embedded etcd datastore, I think that would also need to be driven by someone on the Synadia side.

brandond avatar May 15 '23 19:05 brandond

I agree that some ops aspects need to be added or documented.

gedw99 avatar May 16 '23 08:05 gedw99

> need to provide instructions on how to set this up as I believe it requires a user-managed config file to accomplish.

This can be accomplished programmatically without config files for this particular setup. The Kine integration relies on the NATS server package which makes all of the config options available to be configured.

Since this would be a k3s feature, we would likely need to add support for additional query params on the Kine endpoint to indicate "cluster-mode" for example. But that design can get worked out to prevent needing users to manually define config files. It should be opt-in if they want more control, but not required.
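As a rough illustration of the query-parameter idea above, here is a minimal Go sketch that reads cluster settings from a Kine-style `nats://` endpoint. The parameter names `cluster-mode` and `replicas`, the `ClusterOptions` type, and the `ParseEndpoint` function are all hypothetical and chosen only for this example; they are not Kine's actual API, and the real design would still need to be worked out.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// ClusterOptions holds settings read from the endpoint's query string.
// Hypothetical: these fields and parameter names are illustrative only.
type ClusterOptions struct {
	Host        string
	ClusterMode bool
	Replicas    int
}

// ParseEndpoint extracts the host and optional cluster settings from a
// nats:// datastore endpoint. Missing parameters fall back to defaults
// (no clustering, one replica), so clustering stays opt-in.
func ParseEndpoint(endpoint string) (ClusterOptions, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return ClusterOptions{}, err
	}
	if u.Scheme != "nats" {
		return ClusterOptions{}, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}

	opts := ClusterOptions{Host: u.Host, Replicas: 1}
	q := u.Query()

	if v := q.Get("cluster-mode"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return ClusterOptions{}, fmt.Errorf("invalid cluster-mode %q", v)
		}
		opts.ClusterMode = b
	}
	if v := q.Get("replicas"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 {
			return ClusterOptions{}, fmt.Errorf("invalid replicas %q", v)
		}
		opts.Replicas = n
	}
	return opts, nil
}

func main() {
	opts, err := ParseEndpoint("nats://localhost:4222?cluster-mode=true&replicas=3")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", opts)
}
```

Keeping everything in the endpoint string means users who do not care about clustering pass nothing extra, while those who want more control could still supply a full config file.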

> the other stuff that would provide complete parity with the embedded etcd datastore, I think that would also need to be driven by someone on the Synadia side.

That is the intent for sure and why I am looking for guidance to understand the scope of complete parity! I don't want to boil the ocean in one pass if there is too much, but this is a good first list.

> 1. Is k3s expected to supply backup/restore functionality?

If this functionality sits behind an interface, then we can hook in NATS's standard method of backing up and restoring stream/consumer state. I will need to read up on what k3s does today to compare.

> 2. Should an operator be able to run NATS in their cluster while also using it as the embedded datastore?

They certainly should be able to run an additional server/cluster in k3s itself, independent of the embedded one, if they choose to. They shouldn't need to; however, I could understand the argument that they don't want to mix k3s and application concerns, or risk applications impacting the embedded server/cluster, and would prefer a clear boundary.

One could say the same about etcd, but one distinction with NATS is that with its multi-tenancy support, the k3s/kine state and messaging would be completely isolated from any applications.

In terms of recommended approaches, having a set of use cases and/or considerations on whether to reuse the embedded cluster vs. running another container should be sufficient for people to make that decision.

> 3. Should NATS certs be rotated during manual certificate rotation?

Based on the link, it looks like k3s is temporarily shut down to do the cert rotation? That would certainly work for NATS as well. Custom CAs can also be set in the NATS config.

bruth avatar May 16 '23 11:05 bruth

Hey @VestigeJ, I saw you assigned this to yourself! Are you actively working on this or interested in collaborating?

bruth avatar Jun 05 '23 17:06 bruth

Hey @bruth, I DM'd you back on your home Slack. If you want to work together, I'd be more than happy to. :)

VestigeJ avatar Jun 05 '23 22:06 VestigeJ

@bruth Did this get put onto a back burner on the Synadia side?

VestigeJ avatar Sep 07 '23 23:09 VestigeJ

@VestigeJ I think we're waiting on

  • https://github.com/k3s-io/kine/pull/194#issuecomment-1699626340

brandond avatar Sep 08 '23 02:09 brandond

@VestigeJ if it has been put on a back-burner then it would be very unfortunate that @bruth chose to highlight it on a recent podcast.

udf2457 avatar Oct 09 '23 22:10 udf2457

@udf2457 that comment is probably best directed at @bruth himself, not anyone on the K3s team. NATS support is maintained by the Synadia folks.

brandond avatar Oct 09 '23 23:10 brandond

@udf2457 This was a temporary back burner. Focus has been on the NATS 2.10 release the past couple of months. The Kine PR works, but there are a couple of remaining subtle recovery issues to address (likely tweaking a couple of timeouts). Now that 2.10 is out, focus is shifting back and I will have an update next week.

bruth avatar Oct 09 '23 23:10 bruth

Hey folks, just giving a quick update so it doesn't get lost in the void again. I made some more progress today on the Kine PR (k3s-io/kine#194), including porting the client code to the new JetStream API. I am debugging a few remaining things, but planning to have it ready for review and merge early next week.

As it pertains to this issue, it will support HA mode without needing to change anything in k3s itself. This is a simpler option/better outcome IMO given how intertwined etcd is as a dependency (outside of kine).

Regarding backup/restore, this can be achieved out-of-band using standard NATS utilities. If there is a strong desire to get them baked into k3s utilities, I am happy to move that along.

bruth avatar Oct 26 '23 20:10 bruth

Converted https://github.com/k3s-io/kine/pull/194 to ready for review. There are some final bits to clean up and a couple of failure cases to test, but it is in a good spot. Docs will come in the next couple of days.

bruth avatar Nov 01 '23 00:11 bruth

Bumping this back out; embedded nats support is still disabled by build flag. We'll need to add -tags nats to the K3s build flags to enable this.

At the moment nats only supports external servers.
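For readers unfamiliar with Go build tags, the following is a minimal sketch of how a feature can be gated behind `-tags nats`. It shows only the stub side: a file compiled when the `nats` tag is absent, which reports that support is not built in. The function name `newNATSBackend` and the error variable are hypothetical and do not reflect the actual Kine or K3s file layout; a sibling file marked `//go:build nats` would hold the real implementation.

```go
//go:build !nats

package main

import (
	"errors"
	"fmt"
)

// errNATSDisabled is returned by the stub when the binary was built
// without the "nats" tag. Hypothetical name, for illustration only.
var errNATSDisabled = errors.New("nats support not compiled in; rebuild with -tags nats")

// newNATSBackend is the stub variant. A file with the opposite
// constraint (//go:build nats) would define the same function with a
// working implementation, so callers compile either way.
func newNATSBackend(endpoint string) error {
	return errNATSDisabled
}

func main() {
	if err := newNATSBackend("nats://localhost:4222"); err != nil {
		fmt.Println("datastore error:", err)
	}
}
```

With this pattern the default build carries no NATS code, which is why enabling the tag changes the binary size.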

brandond avatar Nov 16 '23 01:11 brandond

@brandond Other than documentation, what would help get this supported in v1.29?

bruth avatar Nov 16 '23 01:11 bruth

Docs would be good, and maybe get a PR open now to add the build flag so we can see what the current size impact is?

brandond avatar Nov 16 '23 04:11 brandond

Looks like it adds about 2MB to the K3s size. I'm seeing the binary go from 58MB to 60MB

derek@degion:~/rancher/k3s$ ls -lh ./dist/artifacts/
total 247M
-rwxr-xr-x 1 derek derek  60M Nov 16 09:54 k3s

dereknola avatar Nov 16 '23 17:11 dereknola

Testing note - stalled currently for December or January releases

VestigeJ avatar Nov 30 '23 23:11 VestigeJ