traefik-proxy
kv store support and clients
We're in a bit of a weird situation with Key-Value (KV) store support: there don't appear to be any maintained Python clients for etcd or consul. Traefik supports several KV stores, and we happened to pick etcd and consul. Not for any hugely specific reason, but they are single binaries, which makes them easy to install.
We've been using https://github.com/kragniz/python-etcd3, which is mostly unmaintained, and a breakage in grpcio prompted a few "works for me" forks, which may or may not take over, or end up abandoned, too. https://opendev.org/openstack/etcd3gw appears to be maintained, but doesn't seem to be meant for outside use, given its lack of documentation, public bug reporting, or contribution process, and the fact that roughly the only thing in its docs, a pip install command, has the wrong package name. python-consul2 also appears to be abandoned, with no real candidate for an alternative.
grpcio/protobuf in general does not seem to be a good stack for Python clients, which I think would be better served by far simpler, more stable HTTP APIs.
I don't think we really care what's used, and the Python redis API situation is far healthier than etcd or consul. All we really care about is being able to support multiple traefik replicas in z2jh.
I think bootstrapping the KV store is far less important than bootstrapping traefik itself. In any situation where a KV backend is used, the KV store is almost guaranteed to be run separately via a container (there's no real reason to use KV on a single machine like littlest-jupyterhub, where files work just fine). So there's ~no situation where I imagine the install.py bootstrapping of a kv store being useful in practice, and it's certainly not worth the relatively high maintenance cost of keeping install.py updated vs the small cost of end-users installing a single binary of their choice.
So the question is:
- what KV stores do we support?
- what tools do we support installing ourselves (just traefik, or traefik, etcd, consul, etc.)?
I currently think we should:
- remove etcd and consul from install.py, leave it just for traefik
- maybe deprecate consul support altogether (don't delete it because it works, but don't put more effort into maintaining it)
- consider adding redis, as the far healthier option on the Python side
- consider rewriting etcd to use HTTP instead of any etcd3 Python client. Our uses are so minimal, that this may be the simplest approach
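To make the HTTP idea a bit more concrete, here is a minimal sketch of what talking to etcd's JSON gateway could look like with only the standard library. It assumes the `/v3/kv/put`, `/v3/kv/range` and `/v3/kv/deleterange` gateway paths (older etcd releases expose them under a different version prefix) and no authentication. The helper names (`put_payload`, `prefix_range_payload`, `post`) are made up for illustration and are not existing traefik-proxy code.

```python
import base64
import json
from urllib import request


def _b64(raw: bytes) -> str:
    """The JSON gateway expects keys and values as base64 strings."""
    return base64.b64encode(raw).decode("ascii")


def put_payload(key: str, value: str) -> dict:
    """Request body for POST /v3/kv/put."""
    return {"key": _b64(key.encode("utf8")), "value": _b64(value.encode("utf8"))}


def prefix_range_payload(prefix: str) -> dict:
    """Request body for POST /v3/kv/range or /v3/kv/deleterange over a prefix.

    range_end is the prefix with its last byte incremented, the usual etcd
    convention for "every key starting with prefix".
    Assumption: prefix is non-empty and its last byte is not 0xff.
    """
    raw = prefix.encode("utf8")
    end = raw[:-1] + bytes([raw[-1] + 1])
    return {"key": _b64(raw), "range_end": _b64(end)}


def post(base_url: str, path: str, payload: dict) -> dict:
    """Send one JSON request to the gateway and decode the JSON reply."""
    req = request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

With an etcd listening locally (the URL is an assumption), usage would be along the lines of `post("http://127.0.0.1:2379", "/v3/kv/put", put_payload("traefik/foo", "bar"))`. Atomic multi-key updates would presumably need the transaction endpoint as well, which this sketch leaves out.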
I think supporting redis as a (the default?) distributed KV store makes sense. It's widely used and understood, the Python module is maintained by redis, it's easily runnable as a container/helm chart as well as a fully managed cloud service, and if you did want to install it on a VM it's most likely in your Linux distribution repository.
I don't think we need to support installing redis/etcd/consul in the installer since there's a file backend https://github.com/jupyterhub/traefik-proxy/blob/main/jupyterhub_traefik_proxy/fileprovider.py
This would be similar to how JupyterHub supports multiple databases like PostgreSQL and MySQL, but only sqlite is supported out of the box, and we don't include other databases as part of the installation process.
I'm +1 on keeping only traefik in the installer.
As for which kv stores to support, I'd also advocate for adding redis from a maintainability point of view, and deprecating both consul and etcd.
> consider rewriting etcd to use HTTP instead of any etcd3 Python client. Our uses are so minimal, that this may be the simplest approach
I believe this makes sense as its own issue, which could be implemented if and when the need arises.
Does anyone remember the background decisions that led to the choice of consul and etcd? This would help us decide whether to keep, deprecate or drop them.
Agreed, I also thought install.py was a little unnecessary, and agree that it adds an unnecessary maintenance burden, with having to change the checksums, etc. I was completely unaware that python-etcd3 and python-consul were no longer maintained, though.
On a slightly related, but separate note...
I guess (because I've never deployed a Kubernetes cluster) TLJH describes how to deploy a Kubernetes cluster with jupyterhub-traefik-proxy running as a service. Personally, I use docker-compose to run jupyterhub with jupyterhub-traefik-proxy in one service and traefik in another service (actually in a completely separate docker-compose project). I don't bother with etcd or consul backends, as I run this on a single host, so I personally find the high availability backends unnecessary. What I'm getting at is that I think an example / minimal docker-compose file, with related config files and documentation, would be useful. Thoughts? I'm happy to put some time into this.
> Does anyone remember the background decisions that led to the choice of consul and etcd? This would help us decide whether to keep, deprecate or drop them.
IIRC (maybe @GeorgianaElena remembers better), etcd was selected first, just because it was the first and simplest kv store that came to mind. We picked up consul due to apparent performance issues with etcd (#56). Both being simple Go binaries also makes them easy to install/deploy, e.g. for tests, but I don't think they were chosen with great care.
traefik config-loading seems to be incredibly slow compared to CHP, but we need to revisit the benchmarks to get an updated comparison (#163). Maybe we can get redis in there as well. I can't seem to find a benchmark of traefik's KV performance for different providers. The main consideration is traefik key-value watch performance, which does seem to vary across KV implementations, at least in traefik 1.x.
Exploring consul clients a bit more, there's:
- hc-pyconsul, which appears to be brand new and active, but only created/used by one person so far
- py-consul is a slightly less outdated fork than python-consul2, but it's an explicitly temporary fork that doesn't allow Issues, so it isn't really planned as a stable client
After #185, it should be a lot easier to add KV implementations like redis, since only 3 methods need to be implemented: generic methods to add, remove, and get keys from a kv store.
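As a rough illustration of how small that surface is, here is a sketch of a three-method backend interface with a toy in-memory implementation. The class and method names (`KVBackend`, `set_keys`, `delete_keys`, `get_tree`) are hypothetical and are not the actual names introduced in #185. A redis, etcd or consul backend would fill in the same three operations against its own client.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class KVBackend(ABC):
    """Hypothetical minimal interface: add, remove, and get keys."""

    @abstractmethod
    def set_keys(self, to_set: dict[str, str]) -> None:
        """Store every key/value pair in to_set."""

    @abstractmethod
    def delete_keys(self, keys: list[str]) -> None:
        """Remove the given keys; missing keys are ignored."""

    @abstractmethod
    def get_tree(self, prefix: str) -> dict[str, str]:
        """Return all key/value pairs whose key starts with prefix."""


class InMemoryKV(KVBackend):
    """Toy backend backed by a dict, only to show the shape of the interface."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def set_keys(self, to_set: dict[str, str]) -> None:
        self._data.update(to_set)

    def delete_keys(self, keys: list[str]) -> None:
        for key in keys:
            self._data.pop(key, None)

    def get_tree(self, prefix: str) -> dict[str, str]:
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}
```

A real backend would also want the set and delete operations to be atomic across keys, so that traefik never observes a half-written route. The dict version sidesteps that question entirely.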
btw, I found etcd3gw's development page, which I couldn't find last time since all of its official links are broken. It's still clearly actively maintained, but shows all signs of being a purely internal tool, not meant for public use:
- The absolute lack of docs (last updated: 2017)
- no responses to most bug reports
- no support for authenticated access to etcd3 (a basic requirement for using it), though calico has a tiny subclass to add it
The next time etcd breaks, if it happens, I think we should either:
- drop etcd3 support, or
- use etcd3gw and vendor Calico's auth-adding subclass
Great to see redis support getting merged! Is this going to be released anytime soon?