ListWatch initial list performance is slow
Describe the bug
When a list watch is created (or any other time listWatch.resourceVersion === ''), the ListWatch will attempt to bootstrap its list of objects. Inserting the items from the API Server's List response into ListWatch.objects is O(n^2), which can be very slow for very large numbers of objects (on the order of 100K).
In particular:
- ListWatch.doneHandler invokes ListWatch.addOrUpdateItems with all new items.
- ListWatch.addOrUpdateItems iterates over all items returned from the API Server and calls addOrUpdateObject for each item.
- addOrUpdateObject calls findKubernetesObject, which calls objects.findIndex, which again goes over all items.
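The call chain above can be reproduced in isolation. The following is a minimal sketch, not the library's source: `isSameObject`, `findObjectIndex`, `addOrUpdateObject`, `makeItems`, and `countComparisons` are hypothetical stand-ins written for illustration, and the identity check (namespace plus name) is an assumption. It counts how many element comparisons a linear-scan insert performs when every item is new.

```javascript
// Hypothetical stand-in for the library's identity check (assumed: namespace + name).
function isSameObject(a, b) {
  return a.metadata.name === b.metadata.name && a.metadata.namespace === b.metadata.namespace;
}

let comparisons = 0;

// Linear scan over the cache, counting every element visited.
function findObjectIndex(objects, obj) {
  return objects.findIndex((elt) => {
    ++comparisons;
    return isSameObject(elt, obj);
  });
}

// Replace the object if it is already cached, otherwise append it.
function addOrUpdateObject(objects, obj) {
  const ix = findObjectIndex(objects, obj);
  if (ix === -1) {
    objects.push(obj);
  } else {
    objects[ix] = obj;
  }
}

// Build n distinct synthetic objects, standing in for a List response.
function makeItems(n) {
  return Array.from({ length: n }, (_, i) => ({
    metadata: { name: `cm-${i}`, namespace: 'default' },
  }));
}

// Insert n new items into an empty cache and report the comparison count.
function countComparisons(n) {
  comparisons = 0;
  const objects = [];
  for (const item of makeItems(n)) {
    addOrUpdateObject(objects, item);
  }
  return comparisons;
}

console.log(countComparisons(1000)); // 499500
console.log(countComparisons(2000)); // 1999000
```

In this sketch, doubling the item count roughly quadruples the comparisons (n(n-1)/2 for n new items), which is the quadratic growth described above.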
**Client Version**
1.0.0-rc6
**Server Version**
1.29.8 (Although I don't think this is relevant)
To Reproduce
I added a shim into cache.js in the @kubernetes/client-node package to keep track of how many times an object in the cache was accessed via findKubernetesObject.
function findKubernetesObject(objects, obj) {
  return objects.findIndex((elt) => {
    ++global.accesses; // This is the only addition
    return isSameObject(elt, obj);
  });
}
test.mjs
import * as k8s from '@kubernetes/client-node';

const apiUrl = '/apis/v1/namespaces/default/configmaps';
global.accesses = 0;

const kc = new k8s.KubeConfig();
kc.loadFromDefault();
const watch = new k8s.Watch(kc);
const client = kc.makeApiClient(k8s.CoreV1Api);

const lw = new k8s.ListWatch(
  apiUrl,
  watch,
  async () => {
    const x = await client.listNamespacedConfigMap({ namespace: 'default' });
    console.log('fetched items', x.items.length);
    return x;
  },
  false
);

const start = performance.now();
await lw.start();
console.log('elapsed', performance.now() - start);
console.log('object accesses', global.accesses);
Test context
$ minikube start
...
$ seq 50000 | xargs -P 100 -I'{}' kubectl create cm cm-'{}'
...
$ node test.mjs
fetched items 50004
elapsed 12140.640042
object accesses 2500350012
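As a quick arithmetic check on the output above: the reported access count is exactly n(n-1) for n = 50004 fetched items, which is consistent with quadratic growth. The snippet below only verifies that arithmetic; the variable names are illustrative.

```javascript
// Values copied from the output above.
const fetchedItems = 50004;
const reportedAccesses = 2500350012;

// n * (n - 1) stays well below Number.MAX_SAFE_INTEGER, so plain numbers are exact here.
const quadraticCount = fetchedItems * (fetchedItems - 1);

console.log(quadraticCount === reportedAccesses); // true
```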
Expected behavior
It seems reasonable that insertion is O(n) if possible.
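One way to get O(n) insertion is to index objects by a key instead of scanning an array. This is a minimal sketch under stated assumptions, not the library's implementation: `objectKey` and `addOrUpdateItems` are hypothetical helpers, and using namespace/name as the key assumes that pair uniquely identifies an object within one resource type.

```javascript
// Hypothetical key function (assumed: namespace/name is unique per resource type).
function objectKey(obj) {
  return `${obj.metadata.namespace ?? ''}/${obj.metadata.name}`;
}

// Insert or replace each item with a single Map operation,
// so n items cost O(n) expected time instead of O(n^2).
function addOrUpdateItems(index, items) {
  for (const item of items) {
    index.set(objectKey(item), item);
  }
  return index;
}

const cacheIndex = new Map();
addOrUpdateItems(cacheIndex, [
  { metadata: { name: 'cm-1', namespace: 'default' }, data: { v: '1' } },
  { metadata: { name: 'cm-2', namespace: 'default' }, data: { v: '1' } },
]);

// A later update for cm-1 replaces the entry rather than appending a duplicate.
addOrUpdateItems(cacheIndex, [
  { metadata: { name: 'cm-1', namespace: 'default' }, data: { v: '2' } },
]);

console.log(cacheIndex.size); // 2
console.log(cacheIndex.get('default/cm-1').data.v); // 2
```

A Map also preserves insertion order, so `Array.from(cacheIndex.values())` can still return the cached objects as a list when one is needed.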
**Example Code** N/A
Environment (please complete the following information):
- OS: Linux
- NodeJS Version: 20
- Cloud runtime: EKS + Minikube
We're happy to take performance improvements (and benchmarks) for this code. Please send PRs.
Also, if you don't want/need to build the entire cache, you can also specify a resource version, and start from there.
Unfortunately, I do need to build the entire cache. I'll try and send through a PR sometime this week.
@uint0 I started to dig into this; it was fixed in the 0.x client series:
https://github.com/kubernetes-client/javascript/pull/780
So you could try that client. I will try to cherry-pick those improvements into the 1.x branch.
I ported this to the 1.x branch here: https://github.com/kubernetes-client/javascript/pull/1995
If you want, you can patch and try that.
@uint0 we've merged this but it's not yet released.
You could try latest release-1.x branch or patch this manually with a cherry-pick for example.
I'll leave that issue open for a while so you can give feedback in case it doesn't fix your problem.
This looks awesome!
Our total elapsed time now is effectively the time needed for the network list call. The actual insertion is going from minutes to ~30ms for up to 100000 items :)
Thanks so much, and sorry I couldn't get around to making a PR 😅.
Timings
(These ignore network time and average over 50 calls, unlike the original script.)

| Items | Original time (ms) | Updated time (ms) |
|---|---|---|
| 1000 | 7 | 13 |
| 5000 | 93 | 11 |
| 10000 | 294 | 15 |
| 20000 | 880 | 18 |
| 50000 | 6077 | 25 |
| 80000 | 16944 | 32 |
| 100000 | 28524 | 37 |
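For anyone wanting to take similar measurements, here is a rough sketch of timing insertion alone, averaged over repeated runs. It is not the script used for the table above: `averageMs` is a hypothetical helper, the items are synthetic so no network call is involved, and the Map-based insert in the timed function is only a placeholder for whatever insertion code is under test.

```javascript
// Hypothetical helper: average wall-clock time of fn over `runs` calls, in milliseconds.
function averageMs(fn, runs = 50) {
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    total += performance.now() - start;
  }
  return total / runs;
}

// Synthetic items stand in for a List response, so network time is excluded.
const sampleItems = Array.from({ length: 10000 }, (_, i) => ({
  metadata: { name: `cm-${i}`, namespace: 'default' },
}));

// Placeholder workload: keyed insertion into a fresh Map on every run.
const avgInsertMs = averageMs(() => {
  const index = new Map();
  for (const item of sampleItems) {
    index.set(`${item.metadata.namespace}/${item.metadata.name}`, item);
  }
});

console.log('average insert ms', avgInsertMs);
```

Absolute numbers from a harness like this depend on the machine and Node version, so they are best compared against each other rather than against the table.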