Very slow kubectl apply performance due to repeated OpenAPI schema parsing
What happened: kubectl apply is very slow at processing manifest files. kubectl appears to reread its configuration several times per manifest it is supposed to apply, which gets extremely slow: several minutes when applying a large number of manifests.
How to reproduce it (as minimally and precisely as possible):
$ cat config
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /tmp/cluster.pem
    server: https://cluster.url
  name: cluster
contexts:
- context:
    cluster: cluster
    user: user
  name: cluster-user
current-context: cluster-user
kind: Config
preferences: {}
users:
- name: user
  user:
    token: ...
$ for i in {1..10}; do echo '{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns'$i'" }}' ; done > manifests
$ cat manifests
{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns1" }}
{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns2" }}
{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns3" }}
...
Now run the following dry-run kubectl command and observe how often it opens (and reads) the certificate-authority file cluster.pem:
$ kubectl version
Client Version: v1.31.0
Kustomize Version: v5.4.2
$ for i in {1..10}; do echo '{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns'$i'" }}' ; done > manifests
$ KUBECONFIG=config GOMAXPROCS=2 strace -f -e openat kubectl apply --dry-run=client -f manifests |& grep cluster.pem -c
27
# runs in 1 second
$ for i in {1..100}; do echo '{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns'$i'" }}' ; done > manifests
$ KUBECONFIG=config GOMAXPROCS=2 strace -f -e openat kubectl apply --dry-run=client -f manifests |& grep cluster.pem -c
207
# runs in 12 seconds
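(note the scaling: 27 opens for 10 manifests and 207 for 100 match 2n + 7, i.e. two reads of cluster.pem per input manifest plus a constant startup overhead)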
Increasing the number of objects in the manifests file increases the number of times kubectl reads the cluster certificate from its configuration, which does not seem necessary. Edit: see below; the real problem is repeated JSON decoding of the OpenAPI schema documents.
When applying more manifests, the time before kubectl even contacts the server grows to several minutes, which is very relevant for e.g. CI test runs.
The problem should not be the manifest parsing time itself: e.g. Python's yaml module or the Go version of yq can parse these manifests in a fraction of the time kubectl takes.
What about passing --dry-run=server or using --server-side? Does the same situation occur with server-side apply as well?
yes, the problem occurs before it even starts talking to the server about the manifests (and after it has set up its resource caches from the server's api-resources)
There might be latency in the discovery endpoint. Could you please rerun with increased log verbosity (-v=9) and check exactly which endpoint is so slow?
it is not an endpoint that is slow, it is spending all its time reading the certificate authority over and over again at 100%-150% CPU usage
here is the output with network logging:
$ kubectl apply --dry-run=client -f manifests -v 6 |& ts
Dez 05 13:31:33 I1205 13:31:33.126524 397199 loader.go:395] Config loaded from file: config
Dez 05 13:31:33 I1205 13:31:33.159773 397199 round_trippers.go:553] GET server/openapi/v3?timeout=32s 200 OK in 30 milliseconds
Dez 05 13:31:33 I1205 13:31:33.191357 397199 round_trippers.go:553] GET server/openapi/v3/api/v1?hash=......&timeout=32s 200 OK in 25 milliseconds
<<< 11 seconds of reading cluster.pem >>>
Dez 05 13:31:44 I1205 13:31:44.831916 397199 round_trippers.go:553] GET .../api/v1/namespaces/ns1 404 Not Found in 6 milliseconds
if you give me some pointers how to create a cpu profile I can pinpoint the exact cause for you, I'm not very familiar with golang tooling.
Reading the cluster.pem shouldn't take 11 seconds. I'd recommend checking the output by running at the highest verbosity (i.e. -v=99).
please read my full report: reading the cluster pem once does not take 11 seconds; reading it hundreds of times (it seems twice per input manifest), plus the other code associated with loading the kubeconfig, does.
adding -v=99 does not change the output, as nothing is logged for that part of the code
Dez 05 13:40:14 I1205 13:40:14.767803 397762 cached_discovery.go:77] returning cached discovery info from .../crd.projectcalico.org/v1/serverresources.json
<<< 11 seconds of reading cluster.pem + whatever else kubectl does with the config >>>
Dez 05 13:40:24 I1205 13:40:24.913038 397762 round_trippers.go:466] curl -v -XGET -H "Accept: application/json" -H "User-Agent: kubectl/v1.30.6 (linux/amd64) kubernetes/00f20d4" 'server/api/v1/namespaces/ns1'
using --validate=false removes the performance problem
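for reference, that is kubectl's standard client-side validation switch; an illustrative run with the reproducer from above:
$ time kubectl apply --dry-run=client --validate=false -f manifests > /dev/null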
Validation interacts with the API server as in here. I think there is an issue on your cluster. What is your cluster version?
/remove-kind bug
/kind support
the cluster version is 1.30.6 but it is not a cluster problem; there is no network traffic at all during this part. Have you tried running my reproducer yourself?
I checked how to profile Go programs; the parsing of cluster.pem is not the main problem (although it is very odd that it happens). The real cost is the decoding of the 1.3 MiB OpenAPI schema, which appears to happen again for every single input manifest.
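For reference, the profile is easy to capture because kubectl ships built-in pprof support via its global --profile/--profile-output flags (the same --profile=cpu used in the timings further down; the pprof inspection step is illustrative):
$ kubectl apply --dry-run=client -f manifests --profile=cpu --profile-output=cpu.pprof > /dev/null
$ go tool pprof -top cpu.pprof
The gdb backtrace pinpointing the hot spot: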
#6 0x00000000021caade in k8s.io/client-go/openapi3.(*root).GVSpec (r=0xc00061b9d0, gv=..., ~r0=0x0, ~r1=...)
at /kubernetes/kubernetes/staging/src/k8s.io/client-go/openapi3/root.go:102
102 err = json.Unmarshal(openAPISchemaBytes, &parsedV3Schema)
(gdb) p openAPISchemaBytes
$2 = {
array = 0xc00114a000 "{\"openapi\":\"3.0.0\",\"info\":{\"title\":\"Kubernetes\",\"version\":\"v1.30.6\"},\"paths\":{\"/api/v1/\":{\"get\":{\"tags\":[\"core_v1\"],\"description\":\"get available resources\",\"operationId\":\"getCoreV1APIResources\",\"respo"..., len = 1383492, cap = 1400832}
...
#7 0x00000000021caade in k8s.io/client-go/openapi3.(*root).GVSpec (r=0xc0002be620, gv=..., ~r0=0x0, ~r1=...)
at /kubernetes/kubernetes/staging/src/k8s.io/client-go/openapi3/root.go:102
102 err = json.Unmarshal(openAPISchemaBytes, &parsedV3Schema)
(gdb) break
Breakpoint 1 at 0x21caade: file /kubernetes/kubernetes/staging/src/k8s.io/client-go/openapi3/root.go, line 102.
(gdb) ignore 1 100
Will ignore next 100 crossings of breakpoint 1.
(gdb) c
...
(gdb) info break
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00000000021caade in k8s.io/client-go/openapi3.(*root).GVSpec at /kubernetes/kubernetes/staging/src/k8s.io/client-go/openapi3/root.go:102
breakpoint already hit 42 times
yes, OpenAPI schema validation takes a significant amount of time per resource. You can disable validation as you did above (that is highly discouraged). But I don't think there is anything we can do with respect to this. Validation should be done and it, unfortunately, takes time.
why does the openapi schema need to be decoded for every single manifest? I can load the schema with Python and validate all the manifests with it in fractions of a second, not the multiple seconds to minutes kubectl takes here.
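Back-of-the-envelope with the numbers from this thread: if the schema is decoded roughly twice per manifest, as the cluster.pem reads suggest, 100 manifests mean ~200 passes over a 1,383,492-byte document, i.e. on the order of 260 MiB of JSON unmarshalled for a 100-line input, which lines up with the ~10 s runtime measured below.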
proof-of-concept patch that just caches the parsed schemas:
--- a/staging/src/k8s.io/client-go/openapi3/root.go
+++ b/staging/src/k8s.io/client-go/openapi3/root.go
@@ -59,6 +59,7 @@ type Root interface {
 type root struct {
 	// OpenAPI client to retrieve the OpenAPI V3 documents.
 	client openapi.Client
+	schemas map[string]*spec3.OpenAPI
 }
 
 // Validate root implements the Root interface.
@@ -67,7 +68,7 @@
 // NewRoot returns a structure implementing the Root interface,
 // created with the passed rest client.
 func NewRoot(client openapi.Client) Root {
-	return &root{client: client}
+	return &root{client: client, schemas: make(map[string]*spec3.OpenAPI)}
 }
 
 func (r *root) GroupVersions() ([]schema.GroupVersion, error) {
@@ -93,6 +94,12 @@ func (r *root) GroupVersions() ([]schema.GroupVersion, error) {
 }
 
 func (r *root) GVSpec(gv schema.GroupVersion) (*spec3.OpenAPI, error) {
+	apiPath := gvToAPIPath(gv)
+	cached, found := r.schemas[apiPath]
+	if found {
+		return cached, nil
+	}
+
 	openAPISchemaBytes, err := r.retrieveGVBytes(gv)
 	if err != nil {
 		return nil, err
@@ -100,6 +107,7 @@ func (r *root) GVSpec(gv schema.GroupVersion) (*spec3.OpenAPI, error) {
 	// Unmarshal the downloaded Group/Version bytes into the spec3.OpenAPI struct.
 	var parsedV3Schema spec3.OpenAPI
 	err = json.Unmarshal(openAPISchemaBytes, &parsedV3Schema)
+	r.schemas[apiPath] = &parsedV3Schema
 	return &parsedV3Schema, err
 }
makes it 10 times faster:
/tmp$ for i in {1..100}; do echo '{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns'$i'" }}' ; done > manifests
/tmp$ time ./kubectl.patched --profile=cpu apply --dry-run=client -f manifests > /dev/null
real 0m1.067s
user 0m0.546s
sys 0m0.103s
/tmp$ time ./kubectl.orig --profile=cpu apply --dry-run=client -f manifests > /dev/null
real 0m9.366s
user 0m11.569s
sys 0m0.395s
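Two caveats with the PoC, for anyone picking it up: the map is populated even when json.Unmarshal returns an error, and it is not synchronized, so a real version would want to cache only successful parses and add a mutex if the root is ever shared across goroutines.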
I don't see an obvious way to make kubectl reuse the decoded schemas across multiple resource visits, though the client-go openapi package already caches the JSON document in byte form: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/openapi/cached/groupversion.go#L47
So a possible solution would be to adjust client-go to cache the decoded JSON instead of, or in addition to, the raw bytes; the kubectl issue would then automatically be resolved. I assume that would be an issue to be filed in the main kubernetes project?
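For illustration, such a cache could also be layered on top of client-go without patching it, as a decorator around the openapi3.Root interface (a sketch; cachedRoot and NewCachedRoot are hypothetical names, and it assumes the three Root methods as currently defined in openapi3/root.go):

package openapicache // hypothetical package for this sketch

import (
	"sync"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/openapi3"
	"k8s.io/kube-openapi/pkg/spec3"
)

// cachedRoot decorates an openapi3.Root so each GroupVersion document
// is json-decoded at most once per process instead of once per manifest.
type cachedRoot struct {
	delegate openapi3.Root

	mu    sync.Mutex
	specs map[schema.GroupVersion]*spec3.OpenAPI
}

// NewCachedRoot wraps an existing Root with a parsed-schema cache.
func NewCachedRoot(delegate openapi3.Root) openapi3.Root {
	return &cachedRoot{
		delegate: delegate,
		specs:    make(map[schema.GroupVersion]*spec3.OpenAPI),
	}
}

func (c *cachedRoot) GroupVersions() ([]schema.GroupVersion, error) {
	return c.delegate.GroupVersions()
}

func (c *cachedRoot) GVSpec(gv schema.GroupVersion) (*spec3.OpenAPI, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if spec, ok := c.specs[gv]; ok {
		return spec, nil
	}
	spec, err := c.delegate.GVSpec(gv)
	if err != nil {
		// do not cache failures; a later retry may succeed
		return nil, err
	}
	c.specs[gv] = spec
	return spec, nil
}

func (c *cachedRoot) GVSpecAsMap(gv schema.GroupVersion) (map[string]interface{}, error) {
	return c.delegate.GVSpecAsMap(gv) // could be memoized the same way
}

kubectl could then wrap the Root it already constructs for validation; callers keep the same interface.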
> cache the decoded json
A cache within one invocation of kubectl does sound useful. You could change this issue from a bug report into a feature request @juliantaylor.
As an early idea, I might use a mixture of CBOR for the discovery cache and a small in-memory LRU cache of parsed schemas. Bear in mind that we don't know how big the response from the Discovery API is.
Personally, I'd file the issue against kubectl as it's kubectl you want to enhance, but I can see the case for making a generic client-go solution.
Based on this thread it sounds like a reasonable feature request. I think caching this in kubectl makes the most sense unless we can get some consensus from downstream users of client-go (including ourselves) on how to accomplish this, whether it's desirable, and what the default behavior should be, i.e. opting in to the cache or opting out of it.
It does seem very odd to me that kubectl would need to re-request the same openapi spec from the server over and over, since kubectl can't make requests to multiple clusters at once out of the box, so the spec should not change within one invocation.
/kind feature
/triage accepted
You can do caching by GVK in kubectl, in the factory, for example. I am not sure you can get better results without getting into client-go; the constructors don't let you wrap what needs to be wrapped. I am also not sure it is feasible to do this in the factory globally (I don't know the codebase that well), but you could pull the init logic from the factory into just apply with some copy-paste:
diff --git a/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_caching_verifier.go b/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_caching_verifier.go
new file mode 100644
index 00000000000..da19f4615b0
--- /dev/null
+++ b/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_caching_verifier.go
@@ -0,0 +1,27 @@
+package util
+
+import (
+	"k8s.io/apimachinery/pkg/runtime/schema"
+	"k8s.io/cli-runtime/pkg/resource"
+)
+
+type cachingVerifier struct {
+	cache map[schema.GroupVersionKind]error
+	next  resource.Verifier
+}
+
+func newCachingVerifier(next resource.Verifier) *cachingVerifier {
+	return &cachingVerifier{
+		cache: make(map[schema.GroupVersionKind]error),
+		next:  next,
+	}
+}
+
+func (cv *cachingVerifier) HasSupport(gvk schema.GroupVersionKind) error {
+	if err, ok := cv.cache[gvk]; ok {
+		return err
+	}
+	err := cv.next.HasSupport(gvk)
+	cv.cache[gvk] = err
+	return err
+}
diff --git a/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_client_access.go b/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_client_access.go
index 6a1646b844e..c7b9abf8fbc 100644
--- a/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_client_access.go
+++ b/staging/src/k8s.io/kubectl/pkg/cmd/util/factory_client_access.go
@@ -175,7 +175,7 @@ func (f *factoryImpl) Validator(validationDirective string) (validation.Schema,
 	primary := resource.NewQueryParamVerifierV3(dynamicClient, oapiV3Client, queryParam)
 	secondary := resource.NewQueryParamVerifier(dynamicClient, f.openAPIGetter(), queryParam)
 	fallback := resource.NewFallbackQueryParamVerifier(primary, secondary)
-	return validation.NewParamVerifyingSchema(schema, fallback, string(validationDirective)), nil
+	return validation.NewParamVerifyingSchema(schema, newCachingVerifier(fallback), string(validationDirective)), nil
 }
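Why this helps: as the gdb trace above shows, the V3 query-param verifier's HasSupport ends up in openapi3.(*root).GVSpec, which unmarshals the whole GroupVersion document, so memoizing the result per GVK collapses N identical manifests into one parse per group/version per invocation.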
nice, quickly tried the patch on a somewhat recent ref (9ba7dcecc39274f3acf4064d572996ecdbc96f10) and performance is about 20-30 times faster on the CPU, equivalent to the PoC cache patch I provided earlier:
$ for i in {1..100}; do echo '{ "apiVersion": "v1", "kind": "Namespace", "metadata": { "name": "ns'$i'" }}' ; done > manifests
$ time _output/local/bin/linux/amd64/kubectl.orig apply --dry-run=client -f manifests > /dev/null
real 0m3.562s
user 0m4.144s
sys 0m0.163s
$ time _output/local/bin/linux/amd64/kubectl.patched apply --dry-run=client -f manifests > /dev/null
real 0m0.652s
user 0m0.159s
sys 0m0.044s