
proposal: New architecture of Apache APISIX Ingress controller

Open tao12345666333 opened this issue 2 years ago • 9 comments

In the current architecture of the Apache APISIX Ingress controller, we use the Apache APISIX Ingress controller as a control plane component. The user creates a specified type of CR in Kubernetes, and the Apache APISIX Ingress controller converts it into a data structure that Apache APISIX can accept, then creates, modifies, or deletes the corresponding objects by calling the Admin API (see the sketch after the list below). Such an architecture has the following advantages:

  • The separation of CP and DP can ensure that even if the CP component is abnormal, DP can still run properly;
  • Users can deploy DP in any location they like, including outside the Kubernetes cluster
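
To make the flow concrete, here is a minimal sketch of such a CR. It follows the controller's documented ApisixRoute v2 layout, but all names and values are illustrative; the controller would translate it into Admin API calls (e.g. a PUT against /apisix/admin/routes).

# Illustrative ApisixRoute; the controller converts this CR into
# Admin API requests that create or update the matching route in APISIX.
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  name: httpbin-route            # hypothetical name
spec:
  http:
    - name: rule1
      match:
        hosts:
          - httpbin.example.com  # illustrative host
        paths:
          - /get
      backends:
        - serviceName: httpbin   # hypothetical Service
          servicePort: 80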

But such an architecture also has its disadvantages: users need to maintain a complete Apache APISIX cluster, which cannot be done simply by modifying the replicas field of the Apache APISIX Ingress controller.

I hope to introduce an architecture similar to ingress-nginx, which is widely used in Kubernetes.

In this way, users can complete the deployment directly through a Pod. At the same time, they can simply modify the replicas parameter to scale.
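
As a sketch of what that deployment shape could look like (all names and the image tag are illustrative, not a settled design):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: apisix-ingress              # hypothetical name
spec:
  replicas: 3                       # scaling is just editing this field
  selector:
    matchLabels:
      app: apisix-ingress
  template:
    metadata:
      labels:
        app: apisix-ingress
    spec:
      containers:
        - name: apisix-ingress
          image: apache/apisix-ingress-controller:latest   # illustrative tag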

sync from mail list: https://lists.apache.org/thread.html/r929a6dfa9620d96874056750c6b07b8139b4952c8f168670553dfb86%40%3Cdev.apisix.apache.org%3E

tao12345666333 avatar Jul 30 '21 11:07 tao12345666333

Agree +1

gxthrj avatar Jul 30 '21 11:07 gxthrj

+1

juzhiyuan avatar Aug 01 '21 02:08 juzhiyuan

+1

tokers avatar Aug 01 '21 11:08 tokers

This issue has been marked as stale due to 90 days of inactivity. It will be closed in 30 days if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Jun 29 '22 01:06 github-actions[bot]

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

I guess it will be

APISIX Ingress (gRPC server)  -->  gRPC client
                                      |
                                      +--> APISIX standalone mode

APISIX may become a child process managed by another component we implement.

tao12345666333 avatar Aug 15 '22 09:08 tao12345666333

What about using an etcd adapter to let the custom component support the etcd APIs, so that we can avoid any changes to APISIX?
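
To sketch the adapter idea (the Service name and port are assumptions, not an agreed design): a vanilla APISIX would simply be pointed at the adapter through its ordinary etcd settings in config.yaml.

# APISIX config.yaml sketch: the "etcd" endpoint is really the custom
# component's etcd adapter, so APISIX itself needs no changes
etcd:
  host:
    - "http://apisix-ingress-controller.ingress-apisix.svc:12379"   # assumed adapter endpoint
  prefix: /apisix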

tokers avatar Aug 15 '22 09:08 tokers

APISIX standalone mode fully replaces the configuration on every update, which will have some impact on health checks and caching.
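
For context, standalone mode reads its entire configuration from a single apisix.yaml and re-applies the whole file on every change, which is why stateful things like health-checker status can be affected. A minimal sketch (route and upstream values are illustrative):

# apisix.yaml in standalone mode; the whole file is replaced on each
# update, and APISIX expects it to end with the #END marker
routes:
  - uri: /hello
    upstream:
      type: roundrobin
      nodes:
        "10.244.1.15:8080": 1   # illustrative pod IP
#END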

tao12345666333 avatar Aug 15 '22 10:08 tao12345666333

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

I guess it will be

APISIX Ingress (gRPC server)  -->  gRPC client
                                      |
                                      +--> APISIX standalone mode

APISIX may become a child process managed by another component we implement.

Why use gRPC? gRPC is more complex than a REST API: we would have to define more and more protobuf files, which increases code complexity.

I recommend using the default APISIX Admin API.

sober-wang avatar Aug 16 '22 12:08 sober-wang

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

If there is no storage component, then we will drop the Admin API @sober-wang

Using gRPC allows the server to actively push configuration. Even introducing xDS here is an option.

gRPC is more complex than a REST API: we would have to define more and more protobuf files, which increases code complexity.

It is normal for new features to introduce some code changes.

While the current mode is really simple, I obviously want it to be more powerful. I won't hold it back out of fear of having to write code or add complexity.

tao12345666333 avatar Aug 19 '22 19:08 tao12345666333

I'm new to APISIX so I started reviewing the Architecture and the Deployment modes, alongside the documentation of the Ingress Controller itself (since my goal is to manage APISIX via Kubernetes CRDs and use it as an alternate Ingress Class for K8s Ingresses)

If my understanding is correct, the APISIX Ingress Controller makes the APISIX Control Plane (and the Admin API for that matter) almost entirely obsolete (at least in the context being discussed here).

If the actual configuration of APISIX is now done via Custom Resources (which are ultimately persisted in the etcd of the cluster itself) why would we need another etcd cluster to persist configuration data? I'm guessing that the current architecture of the Ingress Controller was meant to be a functional adapter for the APISIX Admin API without having to introduce any significant changes to the latter whilst still making it possible to use APISIX in the context of a Kubernetes Cluster (and its resources).

While it does what it's meant to do, it also introduces certain problems, some of which deserve serious consideration:

  • the etcd cluster installed with APISIX in the Traditional mode does not get backed up through regular cluster backups
  • in order to be able to talk to the Admin API and configure routes, upstreams, etc., the Ingress Controller needs to store the admin user key in a ConfigMap, which is essentially insecure and makes it harder to rotate credentials. Setting those keys in Helm values also isn't a good solution since we store all the values in our SCM. The best would be if we could get those values from Vault. [edit] Just found out about this issue which goes in line with what I said here [/edit]
  • the Ingress Controllers' IP addresses need to be hard-coded in the APISIX Control Plane configuration if we want to take advantage of admin.allow.ipList. It can be quite challenging to get those values when installing both charts via Helm (without hard-coding them, they are essentially dynamic), so you end up having to allow the whole cluster network range or 0.0.0.0/0 (see the sketch after this list)
  • maintaining an extra Control Plane and an extra etcd cluster means more overhead for the Platform teams and introduces more points of failure. [edit] As an illustration of this, check how many of the apisix-helm-chart repo's issues are related to etcd alone (37 of 80) [/edit]
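
To illustrate the allow-list point above, a Helm values sketch for the apisix chart using the admin.allow.ipList value mentioned in the list (the CIDR is illustrative): because the controller pod IPs are dynamic, the range usually ends up covering the whole pod network.

admin:
  allow:
    ipList:
      - 10.244.0.0/16   # entire cluster pod CIDR rather than specific controller IPs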

It would be great if the Ingress Controller could talk directly to the Data Plane in standalone mode.

macmiranda avatar Nov 21 '22 13:11 macmiranda

You are right!

That's the main reason why I came up with this idea.

This will be my third priority, I will deal with #1465 first and then release v1.6. Then I will start working on this one.

It won't be long before I post my thoughts 💡 here to discuss with you all

tao12345666333 avatar Nov 21 '22 15:11 tao12345666333

(screenshot: 2023-01-18)

I have a new idea.

Since APISIX v3 has added gRPC-client capability, some optimizations have been made to the CP/DP deployment model in APISIX v3. So we can apply this model in APISIX Ingress: implement an etcd-like gRPC server in the APISIX Ingress controller, let it serve as the control plane, and have APISIX, which is actually the data plane, connect to it through gRPC.

In this way, the data plane APISIX is exactly the same as a normal APISIX deployment, not in standalone mode, so you can use all of APISIX's capabilities without any modification to APISIX.
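
A rough sketch of the data-plane side under this idea (the Service name and port are assumptions, nothing is settled yet): APISIX keeps its normal etcd-backed config provider, but the "etcd" it connects to is the controller's etcd-like gRPC server.

# data-plane config.yaml sketch; APISIX believes it is talking to etcd,
# while the ingress controller implements the etcd gRPC surface it needs
deployment:
  role: traditional
  role_traditional:
    config_provider: etcd
  etcd:
    host:
      - "http://apisix-ingress-controller.ingress-apisix.svc:12379"   # assumed endpoint
    prefix: /apisix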

WDYT?

tao12345666333 avatar Jan 18 '23 02:01 tao12345666333

Sounds good to me, though I'm not the most familiar with the APISIX architecture, especially not when it comes to the gRPC components. One question though: would the Ingress controller also need some type of state store, or would it work fine just reading the state from the Kubernetes resources? Also, I'm not sure how the client would authenticate to the server. Would mTLS be an option?

macmiranda avatar Jan 21 '23 13:01 macmiranda

In the new architecture, the ingress controller is a stateless component.

It can just read and store resource status in Kubernetes resources.

For authentication, we can add certificates to protect the connection.
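
For the mTLS question above: APISIX's etcd client already supports TLS client certificates, so a connection to the controller's etcd-like server could be protected the same way. A sketch with illustrative paths:

# data-plane TLS sketch: present a client certificate to the etcd-like
# endpoint and verify the server certificate it returns
etcd:
  host:
    - "https://apisix-ingress-controller.ingress-apisix.svc:12379"   # assumed endpoint
  tls:
    cert: /certs/client.crt    # illustrative paths
    key: /certs/client.key
    verify: true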

tao12345666333 avatar Jan 29 '23 07:01 tao12345666333

Is it possible for the controller to just modify the apisix.yaml ConfigMap, so that standalone APISIX instances can watch those changes?

caibirdme avatar Feb 23 '23 03:02 caibirdme

Is it possible for the controller to just modify the apisix.yaml ConfigMap, so that standalone APISIX instances can watch those changes?

@caibirdme no, this is not designed for standalone mode.

Are you using standalone mode? I want to understand your use case

tao12345666333 avatar Mar 09 '23 15:03 tao12345666333

(screenshot: 2023-01-18)

I have a new idea.

Since APISIX v3 has added gRPC-client capability, some optimizations have been made to the CP/DP deployment model in APISIX v3. So we can apply this model in APISIX Ingress: implement an etcd-like gRPC server in the APISIX Ingress controller, let it serve as the control plane, and have APISIX, which is actually the data plane, connect to it through gRPC.

In this way, the data plane APISIX is exactly the same as a normal APISIX deployment, not in standalone mode, so you can use all of APISIX's capabilities without any modification to APISIX.

WDYT?

It looks like APISIX pulls its configuration from the apisix-ingress-controller. Will the APISIX team members implement this? Are you sure?

Maybe I'm misunderstanding what this means, so can you clarify the direction of the data flow?

sober-wang avatar Mar 10 '23 01:03 sober-wang

Currently, APISIX v3 already supports a decoupled mode: DP and CP are separate.

CP provides an etcd-like service.

In the new architecture of APISIX Ingress, we only need to let the Ingress controller assume the role of the CP; the APISIX data plane remains purely a DP.

tao12345666333 avatar Mar 10 '23 02:03 tao12345666333

Are you using standalone mode? I want to understand your use case

I'm using APISIX in standalone mode as the ingress gateway. I don't want to use Ingress, because it's only designed for HTTP, and I don't want to deploy an etcd cluster either. Now I have a Deployment with 3-10 APISIX replicas, and they're configured by apisix.yaml (a ConfigMap). When I want to update apisix.yaml, I just update the ConfigMap in the Helm chart and upgrade it. About a minute later, the ConfigMap is updated in the Pods, and APISIX picks up the change right away. By doing this, I don't need to learn the apisix-ingress CRDs, I don't need an etcd cluster, and I follow the GitOps manner: all the changes are managed by Git. After reading the APISIX docs, I can configure my ingress as both an L4 proxy and an L7 proxy.
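
A sketch of the shape of that workflow (names and the route are illustrative): the ConfigMap is mounted into the APISIX pods as apisix.yaml, and standalone APISIX reloads it when the mounted file changes.

apiVersion: v1
kind: ConfigMap
metadata:
  name: apisix-standalone-conf   # hypothetical name
data:
  apisix.yaml: |
    routes:
      - uri: /api/*
        upstream:
          type: roundrobin
          nodes:
            "httpbin.default.svc.cluster.local:80": 1
    #END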

caibirdme avatar Mar 20 '23 04:03 caibirdme

discuss a scenario:

If these four situations are met:

  1. k8s control plane works well
  2. apps are rolling update
  3. "ingress controller" cannot sync ingress rules (or long sync delay) 3.1 ingress controller crashloop 3.2 or their nodes down 3.3 or they cannot connect to apiserver (node network problem) 3.4 or apisix etcd down (old architecture)
  4. k8s/apisix administrators don't notice what happened

the data plane (upstream) will reference obsolete pod IPs, which leads to:

  1. an app A pod IP is recycled by the CNI IPAM, so app A's redundancy is reduced
  2. or an app A pod has been terminated and its IP re-assigned to an app B pod: requests for app A will randomly return HTTP 404

How about an architecture where DP and CP run in the same pod? I think it can minimize the risk (3.2 and 3.3 would only affect the corresponding APISIX DP, not all APISIX DPs).
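
A sketch of the same-pod idea (names and images are illustrative): controller and gateway run as containers in one Deployment, so a sync failure is scoped to the co-located gateway rather than the whole fleet.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: apisix-composite          # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: apisix-composite
  template:
    metadata:
      labels:
        app: apisix-composite
    spec:
      containers:
        - name: ingress-controller   # CP: watches Kubernetes resources
          image: apache/apisix-ingress-controller:latest
        - name: apisix               # DP: serves traffic, configured by the CP
          image: apache/apisix:latest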

mchtech avatar Jun 06 '23 17:06 mchtech

Is there any decision on how to implement this feature? Folks at my current company are willing to spend some engineering time on this.

zhuoyang avatar Jun 12 '23 02:06 zhuoyang

@zhuoyang I'm glad to hear this news.

Currently, the #1803 plan is to implement an etcd-server.

In fact, there are still many things we need to do. I will write a detailed technical plan and break down the tasks as soon as possible. Hope we can work together to complete this feature.

tao12345666333 avatar Jul 24 '23 15:07 tao12345666333

cool! let's keep in touch

zhuoyang avatar Aug 01 '23 03:08 zhuoyang

can configure my ingress as both L4 proxy and

Could you please share the code somewhere for how this standalone-mode setup is done? I mean how you are referring to backend Services in different namespaces and passing the apisix.yaml config into the Deployment, as I can't find that capability in the present Helm chart of APISIX. Do we need to manually tweak the Deployment afterwards?

nagidocs avatar Aug 10 '23 13:08 nagidocs

Great idea! This can elevate apisix-ingress-controller to a core position.

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRDs and the other for APISIX's own data.

But when the ingress-controller goes down, it would have to re-fetch, regenerate, and redistribute the route entries upon restart. In scenarios with a large number of Ingresses, might this put more pressure on the apiserver, or result in longer recovery times?

seethedoor avatar Aug 29 '23 09:08 seethedoor

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRDs and the other for APISIX's own data.

I didn't fully understand your meaning: are you referring to the new architecture or the existing one?

tao12345666333 avatar Aug 31 '23 03:08 tao12345666333

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRDs and the other for APISIX's own data.

I didn't fully understand your meaning: are you referring to the new architecture or the existing one?

The existing one; I mean your new design would avoid this. That is the benefit.

seethedoor avatar Sep 06 '23 08:09 seethedoor

https://github.com/apache/apisix-ingress-controller/releases/tag/v1.7.0

v1.7.0 has been released with this feature. Thanks all!!! I will close this one.

tao12345666333 avatar Sep 11 '23 13:09 tao12345666333

Thanks @tao12345666333, is there documentation ready for the new feature?

mfractal avatar Sep 12 '23 22:09 mfractal

@mfractal FYI https://github.com/apache/apisix-ingress-controller/blob/master/docs/en/latest/composite.md

tao12345666333 avatar Sep 13 '23 04:09 tao12345666333