fluvio [Feature] Run without Kubernetes

Is there a simple way to run Fluvio without Kubernetes?

My orchestrator is HashiCorp Nomad.

Feb 13 '22 20:02 sycured

Currently, Fluvio is designed to run on top of Kubernetes. We have a long-term plan to decouple from Kubernetes to allow it to run on different orchestrators. What is your use case and how soon do you need it so we can plan accordingly?

Feb 15 '22 15:02 sehz

I see Fluvio as a Kafka replacement for my use case: less Java and more Rust.

My setup is simple: no Kubernetes at all, with HashiCorp Nomad as the orchestrator, because it's the only one that runs on multiple platforms:

Linux with Podman
FreeBSD with pot, and also macOS

I can't give you a date for when I'll need it outside of Kubernetes. Right now I'm at the dev/MVP stage of the platform, and I have a Kafka cluster to cover this stage, but removing Kafka and the JVM is very appealing.

Feb 15 '22 18:02 sycured

What do you use Kafka for and do you need any connector support?

Feb 16 '22 14:02 sehz

Currently, we're doing the PoC with Kafka and connectors to PostgreSQL (Debezium and sink).

All other parts are in Python or Rust, so nothing special.

One special thing: intensive use of the Avro format. We don't use JSON between services.

Feb 16 '22 15:02 sycured

Thanks for the info. Can you share how much memory each broker consumes and the type of machine instance? We are trying to perform a baseline comparison with Kafka.

Feb 16 '22 15:02 sehz

I'm using Oracle Cloud for the PoC at the moment: Ampere (ARM) A1 instances with 2 vCores and 12 GB of memory. Honestly, they're overkill on memory, so I'm not the best source for those stats right now, sorry.

The other alternative that I'm thinking of benchmarking is NATS.

https://www.oracle.com/cloud/compute/arm/#:~:text=Oracle%20Cloud%20Infrastructure%20offers%20Ampere,cache%2C%20and%20delivers%20predictable%20performance.

https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/apache-kafka-benchmarks-on-aws-graviton2

Feb 16 '22 20:02 sycured

Great. Thanks!

Feb 16 '22 20:02 sehz

Kubernetes coupling is really a killer. Personally, I wish I could test it to see if it could be used for market data ingestion as a replacement for Redpanda.

BTW, as an alternative to Kafka, there is Redpanda, which is extremely simple to install (just a binary...) and configure, and is written in C++ with WASM support. It also doesn't need a "controller"; instead, it uses Raft consensus within the cluster. That avoids the common issue of having an entire cluster down if the single controller is down...

See comparisons here: https://redpanda.com/blog/fast-and-safe/

Fluvio as a Rust competitor with more safety and less conformity with the Kafka API could really prove worthy. As soon as the need for K8s is abandoned...

Feb 25 '22 22:02 arbfay

@arbfay thank you for the feedback. Fluvio was designed for hyper-scale deployments to handle huge volumes of data. In large environments, SPUs may be physically located on an edge device or different geo-location. In such situations, there is a need for a controller that sits outside of the SPU to ensure the overall health of the distributed system. Hence the need for separation between Controller and the SPUs.

Although, we do agree that the dependency on Kubernetes is a headache for small environments. We could use some help from the community on solving this issue.

Feb 25 '22 22:02 ajhunyady

> @arbfay thank you for the feedback. Fluvio was designed for hyper-scale deployments to handle huge volumes of data. In large environments, SPUs may be physically located on an edge device or different geo-location. In such situations, there is a need for a controller that sits outside of the SPU to ensure the overall health of the distributed system. Hence the need for separation between Controller and the SPUs.

For example, if health checks go through Consul (service mesh and registry), Fluvio can use it to find out which other instances are up and running by looking up active.fluvio.service.consul.

With Kafka, that's exactly what I do when I need to configure a consumer or producer: the list of brokers is built using Consul ((active.)kafka.service.consul).

It's just an idea
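A minimal sketch of that idea in Python, assuming the machine's resolver forwards the .consul domain to Consul's DNS interface (which only returns instances that pass their health checks). The function names and the port numbers are hypothetical and only there for illustration; nothing here is Fluvio's or Kafka's own API.

```python
import socket


def resolve_a_records(hostname):
    """Return the sorted, de-duplicated IPv4 addresses that DNS gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


def bootstrap_servers(service, port, tag=None, resolve=resolve_a_records):
    """Build a "host:port,host:port" string from a Consul DNS name.

    The lookup name follows Consul's [tag.]<service>.service.consul form.
    `port` is an assumption supplied by the caller: a plain A-record lookup
    does not return the service port (an SRV lookup would).
    `resolve` can be swapped out, e.g. for tests without a Consul agent.
    """
    name = f"{service}.service.consul"
    if tag:
        name = f"{tag}.{name}"
    return ",".join(f"{ip}:{port}" for ip in resolve(name))


if __name__ == "__main__":
    # Hypothetical usage: needs a resolver that can answer for .consul names.
    print(bootstrap_servers("kafka", 9092, tag="active"))
```

The resulting string can be passed as the broker list of a consumer or producer. The same lookup against active.fluvio.service.consul would give the list of live Fluvio instances.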

> Although, we do agree that the dependency on Kubernetes is a headache for small environments. We could use some help from the community on solving this issue.

Sorry, but as a counterexample, Cloudflare runs services on HashiCorp Nomad: https://blog.cloudflare.com/how-we-use-hashicorp-nomad/

Production without Kubernetes is absolutely possible, and it's closer to the KISS principle (Keep It Simple, Stupid) from the UNIX philosophy.

Feb 26 '22 14:02 sycured

Our reliance on Kubernetes was based on the assumption that it was the most widely deployed platform, and we wanted to make it easy to deploy on it. We take advantage of Kubernetes features like deployment management to run connectors in a single unified platform, which Redpanda and Kafka don't do. But it seems there is a need to support other schedulers such as Nomad.

Feb 26 '22 18:02 sehz

Related to #1558

Jul 26 '23 20:07 digikata

Closing this issue as completed. Running a Fluvio cluster without Kubernetes is now supported by default, using either fluvio cluster start or fluvio cluster start --local.

Feb 21 '24 22:02 digikata
