Kmesh 2025 roadmap update
Following the release of Kmesh v1.0.0, the Kmesh community is starting the new year by collecting user requirements to update its 2025 roadmap. Everyone is welcome to share the specific needs and use cases they have for the Kmesh project.
- Description(Like Usage Scenarios and so on):
- Expected Capabilities:
- Other:
NOTE: We recommend submitting your requirements using the template above.
- Description(Like Usage Scenarios and so on): Multi-cluster support
- Expected Capabilities: Discover and route to services across multiple clusters, provide cross-cluster DNS, and support multi-network topologies.
- Other: Some of my thoughts: In 2024, we successfully ran Kmesh (dual-engine) on Alibaba Cloud Service Mesh (ASM). This indicates that Kmesh can already support Istio's Primary-Remote model, since ASM's control plane is managed (i.e., external to the Kubernetes cluster). I am wondering whether we could evaluate Kmesh’s support for an architecture with one Primary and multiple Remote clusters as a path to multi-cluster support. In Istio, multi-cluster support generally involves several capabilities: cross-cluster service discovery, multi-cluster DNS (with DNS proxy), and east-west communication across multiple networks. These are relatively mature in the Sidecar model, and service mesh users are typically very interested in them because a multi-cluster architecture supports popular end-user use cases such as disaster recovery, application deployment in multiple environments, and distributed application deployment. Currently, Ambient Mesh does not yet support multi-cluster, and the Istio community has no clear schedule for this feature (although Solo.io's Gloo Mesh already supports multi-cluster in its commercial version). On the Kmesh side, we already have DNS capture and are supporting locality load balancing. I believe these capabilities would be even more valuable in a multi-cluster architecture, and supporting such an architecture would make Kmesh more attractive to end users.
- Description(Like Usage Scenarios and so on): Observability for kernel-native (ADS) mode
- Expected Capabilities: Kmesh needs changes in both go-controller and the Kmesh eBPF code to implement this observability feature.
- Other: For details, see https://github.com/kmesh-net/kmesh/issues/965
- Description(Like Usage Scenarios and so on): E2E test for kernel-native
- Expected Capabilities: Kernel-native mode is not exercised by the current E2E tests. Coverage for it needs to be added.
- Other: Without E2E tests, the quality of the community's code cannot be reliably maintained, and regressions may slip in as the code evolves.
- Description(Like Usage Scenarios and so on): Zero downtime during upgrade if the BPF maps do not change
- Expected Capabilities: Currently, during an upgrade, Kmesh first detaches the BPF programs and maps, then reloads and reattaches them. During this window, traffic is not managed.
- Other:
- The mda pkg needs to support observability similar to “tcp dump”
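
The zero-downtime upgrade request above depends on knowing whether the BPF map layout changed between versions. The thread does not say how Kmesh would detect this, so the following is only a minimal sketch of one possible check: compare a description of each map (type, key size, value size, max entries) between the old and new versions, and fall back to the current detach-and-reload path if anything differs. `MapSpec`, `changed_maps`, `upgrade_plan`, and the map name `map_a` are all hypothetical and are not Kmesh APIs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MapSpec:
    """Hypothetical description of one BPF map's layout."""
    map_type: str
    key_size: int
    value_size: int
    max_entries: int


def changed_maps(old: Dict[str, MapSpec], new: Dict[str, MapSpec]) -> List[str]:
    """Return the names of maps that were added, removed, or changed layout."""
    names = sorted(set(old) | set(new))
    return [name for name in names if old.get(name) != new.get(name)]


def upgrade_plan(old: Dict[str, MapSpec], new: Dict[str, MapSpec]) -> str:
    """Keep existing maps only when nothing changed; otherwise reload everything."""
    return "reuse-maps" if not changed_maps(old, new) else "detach-and-reload"


# Illustrative data only: one map that keeps its layout, and one that grows its value.
v1 = {"map_a": MapSpec("hash", 4, 16, 1024)}
v2_same = dict(v1)
v2_resized = {"map_a": MapSpec("hash", 4, 32, 1024)}
```

In this sketch, `upgrade_plan(v1, v2_same)` selects the map-reuse path, while `upgrade_plan(v1, v2_resized)` falls back to detach-and-reload because the value size of `map_a` differs.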
- Description(Like Usage Scenarios and so on): Kmesh supports mTLS
- Expected Capabilities: Kmesh currently offers IPsec, which meets some encrypted-communication needs. Users have requested mTLS, so we still need to support it.
- Description(Like Usage Scenarios and so on): Offload more capabilities to the XDP program
- Expected Capabilities: Kmesh currently supports authz based on IP address and destination port. Authz based on namespace, identity, etc. is missing. We plan to follow up with a patch.
- Other: However, this is difficult to implement in the kernel. I think we should move forward only when there is a real user requirement.
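
To make the gap in the item above concrete, here is a minimal userspace sketch of authz matching on IP address and destination port. It is not Kmesh's XDP code; `Rule`, `is_allowed`, and the sample addresses are hypothetical. It shows that this kind of check needs only packet header fields, whereas namespace or identity matching would additionally require some mapping from addresses to workload identity, which this sketch does not attempt.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical allow rule: a source CIDR plus an optional destination port."""
    src_cidr: str
    dst_port: Optional[int] = None  # None matches any destination port


def is_allowed(rules: List[Rule], src_ip: str, dst_port: int) -> bool:
    """Return True if any rule matches the source IP and destination port."""
    addr = ip_address(src_ip)
    for rule in rules:
        in_cidr = addr in ip_network(rule.src_cidr)
        port_ok = rule.dst_port is None or rule.dst_port == dst_port
        if in_cidr and port_ok:
            return True
    return False  # default deny when no rule matches


# Illustrative rules only.
rules = [Rule("10.0.1.0/24", 8080), Rule("10.0.2.5/32")]
```

With these rules, traffic from `10.0.1.7` to port 8080 is allowed and the same source to port 9090 is denied, while `10.0.2.5` is allowed on any port.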
- Description(Like Usage Scenarios and so on): Adapt Orion
- Expected Capabilities: Waypoint uses Envoy, a mature and powerful web proxy, but its broad feature set also drags down performance. So we are trying to adapt Orion, a new cloud-native agent.
- Other:
AI plugin #1118
- Description(Like Usage Scenarios and so on): Real-time network flow observability (L4 and L7 information)
- Expected capabilities: Kmesh provides Prometheus metrics, which are not real-time and contain only numerical information. We need eBPF-based flow capture and live streaming of network flow information. We could also create a UI for displaying flow info, similar to Wireshark.
- More info: #1282
@yp969803 Can you elaborate a little, perhaps by creating a new issue?