Support concurrent MultipleClusterCache in karmada-search

Open NickYadance opened this issue 11 months ago • 3 comments

What would you like to be added:

An optional concurrent MultipleClusterCache to speed up the list request for karmada-search.

Why is this needed:

The cacher of karmada-search delegates the list request to the underlying storage under the following conditions:

https://github.com/karmada-io/karmada/blob/e7300c3c1850629463af95f54f31a032a1890ce1/vendor/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L724-L741

List requests issued by kubectl or client-go will most likely fall into these conditions, as they usually apply the default list options.
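
Roughly, the delegation check behind that link looks like the following (an abridged paraphrase of the vendored cacher.go, not a verbatim quote; names may differ between versions):

// Abridged paraphrase of the linked delegation check, not verbatim:
// the watch cache serves the list only when none of these hold.
hasContinuation := len(pred.Continue) > 0
hasLimit := pred.Limit > 0 && resourceVersion != "0"
if resourceVersion == "" || hasContinuation || hasLimit {
    // kubectl/client-go defaults leave resourceVersion unset, so such
    // requests fall through here and are delegated to the underlying
    // storage, i.e. the member clusters in karmada-search's case.
    return c.storage.GetList(ctx, key, opts, listObj)
}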

In this case the list request has to go through all member clusters one by one. This can be rather slow when some member clusters sit in different IDCs from karmada-search: 4-8 seconds in our use case. Adding goroutines to the for-loop so that clusters are visited concurrently would bring better time performance in both the caching and delegating cases.

https://github.com/karmada-io/karmada/blob/e7300c3c1850629463af95f54f31a032a1890ce1/pkg/search/proxy/store/multi_cluster_cache.go#L264-L269
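
A minimal sketch of the goroutine idea (not karmada's actual code; listCluster and the plain string items are hypothetical stand-ins for the real lister and object types in multi_cluster_cache.go):

package store

import (
    "context"
    "sync"
)

// clusterResult holds one cluster's List outcome.
type clusterResult struct {
    items []string
    err   error
}

// listConcurrently replaces the sequential for-loop: each cluster is listed
// in its own goroutine, then the results are merged in the original cluster
// order so the combined list stays deterministic.
func listConcurrently(ctx context.Context, clusters []string,
    listCluster func(ctx context.Context, cluster string) ([]string, error)) ([]string, error) {

    results := make([]clusterResult, len(clusters))
    var wg sync.WaitGroup
    for i, cluster := range clusters {
        wg.Add(1)
        go func(i int, cluster string) {
            defer wg.Done()
            items, err := listCluster(ctx, cluster)
            results[i] = clusterResult{items: items, err: err}
        }(i, cluster)
    }
    wg.Wait()

    var merged []string
    for _, r := range results {
        if r.err != nil {
            return nil, r.err // surface the first failing cluster
        }
        merged = append(merged, r.items...)
    }
    return merged, nil
}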

NickYadance avatar Mar 06 '24 07:03 NickYadance

Asking @ikaven1024 to help take a look. /cc @ikaven1024

XiShanYongYe-Chang avatar Mar 06 '24 07:03 XiShanYongYe-Chang

At present, listing from clusters one by one indeed takes more time. But some problems block me from improving it with parallel visits. For example, a client requests a list with limit=500: how many clusters shall we visit then? Obviously not all of them. So before that, we would have to estimate the object count in every cluster. The cluster cache may help with this, but not in all cases; imagine the cluster cache is not ready.
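
To make that blocker concrete, here is a hypothetical sketch of the estimation step (estimatedCount is an assumed helper, e.g. backed by the cluster cache; a return of -1 models "cache not ready"):

// clustersForPage decides how many clusters a single page with the given
// limit could span, walking clusters in order and consuming the limit
// budget against the per-cluster count estimates.
func clustersForPage(clusters []string, limit int64,
    estimatedCount func(cluster string) int64) (picked []string, ok bool) {

    budget := limit
    for _, cluster := range clusters {
        n := estimatedCount(cluster)
        if n < 0 {
            // No estimate (cache not ready): we cannot safely size the
            // fan-out, so fall back to the sequential one-by-one path.
            return nil, false
        }
        picked = append(picked, cluster)
        budget -= n
        if budget <= 0 {
            break // the picked clusters already cover this page
        }
    }
    return picked, true
}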

ikaven1024 avatar Apr 10 '24 10:04 ikaven1024

A paging request has to traverse clusters in order, so it cannot be parallelized by simply adding goroutines. A non-paging request (limit=0) can be parallelized, but it would be confusing that the non-paging request could well end up much faster than the paging one.

The estimation should be helpful in most cases, and maybe easier to implement? Another possible way is to spread the paging request across clusters, like a parallel download, and keep a record of each downloaded piece (a code sketch follows the example below):

/api/v1/pods?limit=500
-->
  /api/v1/pods?limit=100 -> member1
  /api/v1/pods?limit=100 -> member2
  /api/v1/pods?limit=100 -> member3
  /api/v1/pods?limit=100 -> member4
  /api/v1/pods?limit=100 -> member5
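
A rough sketch of that spreading idea, with hypothetical names throughout (listCluster returns a cluster's page items plus its continue token): the limit is split into equal quotas, the clusters are listed in parallel, and each piece records its own continue token so the next page can resume every cluster independently.

package store

import (
    "context"
    "sync"
)

// piece is one cluster's share of a paged list, like one chunk of a
// parallel download.
type piece struct {
    cluster     string
    items       []string
    continueTok string // per-cluster continue token kept between pages
    err         error
}

// listPage splits the client's limit evenly across clusters and lists them
// in parallel. prevTokens carries each cluster's continue token from the
// previous page (empty on the first page). Remainder distribution and
// empty-cluster handling are omitted for brevity.
func listPage(ctx context.Context, clusters []string, limit int64,
    prevTokens map[string]string,
    listCluster func(ctx context.Context, cluster string, limit int64, cont string) ([]string, string, error)) ([]piece, error) {

    quota := limit / int64(len(clusters)) // e.g. 500 across 5 members -> 100 each
    pieces := make([]piece, len(clusters))
    var wg sync.WaitGroup
    for i, cluster := range clusters {
        wg.Add(1)
        go func(i int, cluster string) {
            defer wg.Done()
            items, cont, err := listCluster(ctx, cluster, quota, prevTokens[cluster])
            pieces[i] = piece{cluster: cluster, items: items, continueTok: cont, err: err}
        }(i, cluster)
    }
    wg.Wait()

    // Surface the first per-cluster failure, if any.
    for _, p := range pieces {
        if p.err != nil {
            return nil, p.err
        }
    }
    return pieces, nil
}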

NickYadance avatar Apr 11 '24 03:04 NickYadance