kubeyaml
PyYAML + LibYAML Bindings 15x faster?
Related to https://github.com/fluxcd/flux/issues/1857, I was looking into kubeyaml.
One thing I'm aware of is the optional use of underlying C bindings with Python YAML Parsers.
From what I gather, the C extension is not available for the round-trip functionality of ruamel.yaml (which is used as the default typ arg), so I don't think kubeyaml is making use of this...
In terms of timing, I ran all our manifests through both versions (360 manifests, merged into a single 1.8 MB file).
time cat input.yaml \
| python kubeyaml.py image \
--image="registry.gitlab.com/<my-site>:test" \
--container="main" --kind=Deployment \
--name dev-site \
--namespace=default > output.yaml
Existing
0.00s user 0.01s system 0% cpu 18.336 total
PyYAML + CSafeLoader / CSafeDumper
1.07s user 0.04s system 94% cpu 1.178 total
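As a cross-check on the shell timings, the same comparison could be made in-process, which leaves interpreter start-up and pipe overhead out of the measurement. This is only a minimal sketch: `time_call` and `speedup` are hypothetical helper names, not part of kubeyaml, and the only figures used are the two wall-clock totals quoted above.

```python
import time


def time_call(fn, *args, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` calls of fn(*args)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds, candidate_seconds):
    """Ratio of baseline to candidate run time."""
    return baseline_seconds / candidate_seconds


# Wall-clock totals from the two runs above.
print(f"{speedup(18.336, 1.178):.1f}x")  # prints 15.6x
```

With real parsers, `time_call` would wrap whatever function loads and dumps the merged manifest file for each library.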
The output from PyYAML has single quotes in place of double quotes, and seems to escape newlines differently (I think the end result is ok though, will check).
In terms of my code changes, it was:
import yaml
...
docs = yaml.load_all(infile, Loader=yaml.CSafeLoader)
yaml.dump_all(fn(docs), outfile, Dumper=yaml.CSafeDumper)
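A slightly fuller sketch of the change above, for anyone who wants to try it on their own manifests. PyYAML only exposes `CSafeLoader` and `CSafeDumper` when it was built against LibYAML, so the sketch falls back to the pure-Python classes otherwise. `transform` is a hypothetical stand-in for `fn` and simply passes documents through; `process` is likewise just an illustrative name.

```python
import io
import yaml

# Prefer the LibYAML-backed classes; fall back to pure Python if PyYAML
# was installed without the C extension.
try:
    from yaml import CSafeLoader as Loader, CSafeDumper as Dumper
except ImportError:
    from yaml import SafeLoader as Loader, SafeDumper as Dumper


def transform(docs):
    """Hypothetical stand-in for the manifest update: yields each document unchanged."""
    for doc in docs:
        yield doc


def process(infile, outfile):
    """Stream every YAML document from infile through transform() into outfile."""
    docs = yaml.load_all(infile, Loader=Loader)
    yaml.dump_all(transform(docs), outfile, Dumper=Dumper, default_flow_style=False)


# Small in-memory example with two documents.
source = io.StringIO("kind: Deployment\n---\nkind: Service\n")
result = io.StringIO()
process(source, result)
print(result.getvalue())
```

Note that the safe loader builds plain Python objects, so comments and the original formatting of the input are not carried over to the output.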
For this experiment, I also removed the existing yaml(), but I think preserving single quotes is supported by PyYAML (just not via a nice boolean flag).
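One way to influence quoting in PyYAML is a custom string representer on a dumper subclass. This is only a rough sketch under a stated assumption: the safe loader returns plain `str` values and does not remember how each scalar was quoted, so the example forces one quote style on every string (keys included) rather than preserving the original per value. `DoubleQuotedDumper` is a hypothetical name.

```python
import yaml


class DoubleQuotedDumper(yaml.SafeDumper):
    """Hypothetical dumper that emits every string scalar in double quotes."""


def _represent_str(dumper, value):
    # style='"' asks the emitter for a double-quoted scalar.
    return dumper.represent_scalar("tag:yaml.org,2002:str", value, style='"')


# Registered on the subclass only, so SafeDumper itself is unaffected.
DoubleQuotedDumper.add_representer(str, _represent_str)

text = yaml.dump(
    {"image": "nginx:1.17"},
    Dumper=DoubleQuotedDumper,
    default_flow_style=False,
)
print(text)
```

Passing `style="'"` instead should give single-quoted output in the same way.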
I may have missed something - I'll try to apply the result to our cluster and see...just thought it was a crazy time difference worth getting your feedback on :)
Note, I've not tried this with fluxcd yet...