kapitan
kapitan copied to clipboard
replace jsonnet with go-jsonnet
can you explain about this problem statement?
@mithil2311 We are currently using the direct Python library which uses the C++ version of jsonnet. To increase the speed of jsonnet compilation, we need to move to the Go version https://github.com/google/go-jsonnet and create Python bindings for it.
(Jsonnet dev here)
FYI basic C bindings for go-jsonnet are probably going to be merged soon (PR: https://github.com/google/go-jsonnet/pull/257). The main missing part is the support for native functions. Other than that it's a drop-in replacement.
My understanding is that the Python bindings just use C bindings internally, so no changes outside of setup.py
should be necessary.
HI @sbarzowski,
Please excuse my lack of understanding. Could you kindly explain what you mean by
the main missing part is the support for native functions
Sure. I meant that this function from the libjsonnet.h is not implemented yet in the bindings:
/** Register a native extension.
*
* This will appear in Jsonnet as a function type and can be accessed from std.nativeExt("foo").
*
* DO NOT register native callbacks with side-effects! Jsonnet is a lazy functional language and
* will call your function when you least expect it, more times than you expect, or not at all.
*
* \param vm The vm.
* \param name The name of the function as visible to Jsonnet code, e.g. "foo".
* \param cb The PURE function that implements the behavior you want.
* \param ctx User pointer, stash non-global state you need here.
* \param params NULL-terminated array of the names of the params. Must be valid identifiers.
*/
void jsonnet_native_callback(struct JsonnetVm *vm, const char *name, JsonnetNativeCallback *cb,
void *ctx, const char *const *params);
Native extensions are a feature which allows calling your own C function from Jsonnet - I can see that Kapitan uses it for a few functions (https://github.com/deepmind/kapitan/blob/master/kapitan/lib/kapitan.libjsonnet).
The jsonnet_json_*
family of functions (which is only useful for native functions) is also not implemented in the bindings.
The other missing parts unrelated to native functions are evaluation variants *_stream
and *_multi
and import callbacks (jsonnet_import_callback
function).
Just to be clear, we definitely intend to have all of these implemented on go-jsonnet
side so that it becomes a full drop-in replacement for libjsonnet (that is for evaluation, not code reformatting). Any help is ofc welcome and will significantly speed things up.
@sbarzowski Thanks for the detailed explanation! I now see what the challenges are. If I have time to work on this issue I'll try my best to help.
@yoshi-1224 Can I take this issue for SoC if you aren't planning to work on it full time?
I have some knowledge of golang that might help in implemention on go-jsonnet
.
Hi @vaibhavk Of course! I'm not submitting a proposal for this, but even if I were there's no need to ask because it's not first come first served.
These have now been fixed https://github.com/google/go-jsonnet/issues/322 https://github.com/google/go-jsonnet/issues/323
I wonder, how you are going to use go-jsonnet
shared library:
And if you are going to use the second approach, should we wait for go-jsonnet
to add this code for python
or we are going to add it in this repository?
@alldroll ctypes should work just fine - however the second approach is ideal, in which case, I agree that that it makes sense to add it to go-jsonnet
Hello, is there any progress on this front? The optimisations of go-jsonnet are badly needed in some contexts.
Progress is being made and some bugs and issues are being found along the way. Unfortunately at the moment I don't have a fully working version of it yet.
On Wed, Apr 8, 2020 at 5:18 PM francoisj [email protected] wrote:
Hello, is there any progress on this front? The optimisations of go-jsonnet are badly needed in some contexts.
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/deepmind/kapitan/issues/144#issuecomment-611053527, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEFAADOUJOXLZSEW6AOAJLRLSPWRANCNFSM4FS5L3WQ .
if needed we can help. If you have some work in progress somewhere we can take a look at it.
@ramaro We are trying to integrate gojsonnet with Kapitan, as we need it to improve our compile time (our usage of Kapitan is growing every weeks). Our work is available here.
After trying GoJsonnet bindings in python, I found no issues with it. But, when used with Kapitan, it seems to have some sort of "lock" (can be easily tested using the commit https://github.com/radiofrance/kapitan/commit/c718cd50828e88470425ad3afbfa9ad34f126328 with the files in DRAFT-GOJSONNET
).
After some research, I deduce that these "deadlocks" are due to the way python handles multiprocessing by using fork
. These "locks" disappear when I run multiprocessing with the spawn
context, but introduces another problem; global variables are not copied (because the spawn
mode creates an empty process).
Unfortunately, I didn't have time to do more investigation and I'm not sure if I'm on the right way (I'm begining in Python).
EDIT: I am a teammate of @joulaud
Hey @xunleii great to hear you're using Kapitan!
After some research, I deduce that these "deadlocks" are due to the way python handles multiprocessing by using fork. These "locks" disappear when I run multiprocessing with the spawn context, but introduces another problem; global variables are not copied (because the spawn mode creates an empty process).
That pretty much sums my findings and I started working improving multiprocessing but ran into exactly the same issues, but unfortunately haven't finalised it yet. I will have a look at your work. Thanks
Could it have something to do with this: https://github.com/google/jsonnet/pull/814?
(It's not yet ported to go-jsonnet.)
GIL handling was ported to go-jsonnet in last release cf. https://github.com/google/go-jsonnet/pull/474 and https://github.com/google/jsonnet/releases/tag/v0.17.0
Yes. Unfortunately, it seems we still have some elusive deadlock issues: https://github.com/google/go-jsonnet/issues/484. I wasn't able to dig deeply enough into it yet.
If anyone can share an idea about what that might be about, it will be really appreciated!
When I took a stack-trace on the deadlock produced by @xunleii test-case above I find the exact same stack than in go-jsonnet ticket including Go runtime.gcStart
, which obviously must wait for some other goroutine but there is only one thread in the affected process.
After digging a little I was arriving to the same conclusion as @githubaccount888 on https://github.com/google/go-jsonnet/issues/484#issuecomment-744046277. Go runtime is multithreaded by construction and have a concurrent mark-and-sweep GC. And trying to use multithread with python multiprocessing fork method instead of spawn being a pool full of shark I think that not-withstanding any improvement on go-jsonnet binding we must make kapitan work with the spawn method to have some chance to make it working reliably.
What are the real blockers to make this work? @xunleii you say there is some missing global variables in the spawn cases. Do you know what they are? Would it be difficult to add them to the "worker" closure created by "functools.partial" at https://github.com/deepmind/kapitan/blob/master/kapitan/targets.py#L101-L108 ?
EDIT: typo s/being of/being a/
Unfortunately, I need to investigate another time to found them.
I tried to advance on this once again. By using the "spawn" method of multiprocessing I stumble on problems with callbacks I did not investigate further.
I just bypassed the whole multiprocessing thing in order to test with
diff --git a/kapitan/targets.py b/kapitan/targets.py
index bb952b0..9d3adbd 100644
--- a/kapitan/targets.py
+++ b/kapitan/targets.py
@@ -123,7 +125,9 @@ def compile_targets(
# compile_target() returns None on success
# so p is only not None when raising an exception
- [p.get() for p in pool.imap_unordered(worker, target_objs) if p]
+ for x in target_objs:
+ worker(x)
os.makedirs(compile_path, exist_ok=True)
and it mostly works with a lot of performance improvements.
On my test case target compilation itself is around 10 to 25 times faster with go-jsonnet than with cpp-jsonnet (my typical target takes around 5s to 20s with cpp-jsonnet and arount 0.10s to 0.80s with go-jsonnet).
This means than even disabling multiprocessing when launching kapitan compile
on my 55 targets, I am better off with go-jsonnet and without multiprocessing than before.
- cpp-jsonnet and multiprocessing:
228,19s user 1,95s system 337% cpu 1:08,28 total
- go-jsonnet no multiprocessing:
53,92s user 0,95s system 112% cpu 48,861 total
This is not the case on single target compile as there seems to be some heavier startup time.
Note: I have problem with the yaml_load
callback and I just disabled everything relying on it on my tests, I still must investigate on those callback problems.
Hey @joulaud happy new year!
I have very similar observations - disabling multiprocessing for compilation is actually faster and having it enabled hangs compilation when using go-jsonnet
. I have yet to assess this better, but it feels to me that a future without multiprocessing (at least for compilation) will be much simpler to develop with.
I have committed my latest changes for go-jsonnet support at https://github.com/ramaro/kapitan/tree/gojsonnet
It disables multiprocessing as you suggest but also adds caching for go-jsonnet import paths and a new component to simply test the yaml_load()
callback. Tests pass on Linux running Python 3.8.
Let me know if this works for you!
Hi @ramaro!
Thank you very much for this New Years gift. I have been using it for a week and it works pretty well. The compilation works much better than before (16s to 0.50s on average), without making any changes after compilation. ~Unfortunately on larger targets I didn't see huge differences but it's probably due to the Jsonnet code which can be optimized on my end (or a misconfiguration of which Kapitan I use)~.
EDIT: It's a misconfiguration on my side (I have both kapitan and for the largest one, I've used the "old" kapitan)
From all the team and @joulaud, please accept our warmest thanks @ramaro !
I repeat the thanks to @ramaro This change is a life-saver for our daily workflow and we just migrated fully to your branch.
What would be the next step to make this branch in state of being mergeable for next release of kapitan? Can we help in anything?
@joulaud @uZer @xunleii hey apologies, I haven't been able to look at this further. I did spent sometime a few weeks ago on this and on my current infra (~140 jsonnet targets), kapitan with go-jsonnet seems to run out of memory. So while I'm not sure go-jsonnet is ready for primetime yet (more debugging pending, could be an issue with Kapitan), maybe what we can do is have kapitan load go-jsonnet instead of jsonnet if the go-jsonnet module is present. So you could use it and help figure out any issues. How does that sound?
If you come across anything that can be reproduced with go-jsonnet (either the command or the Python bindings), please let me know!
@joulaud @uZer @xunleii hey apologies, I haven't been able to look at this further. I did spent sometime a few weeks ago on this and on my current infra (~140 jsonnet targets), kapitan with go-jsonnet seems to run out of memory.
I did just try to compile 25 full kube-prometheus stacks and didn't notice any unbound increase in memory usage. The memory usage climb around 500MB in 4 or 5 runs and then remain at this level until end of compilation. This is with golang 1.16 and default gojsonnet release (pip install gojsonnet).
maybe what we can do is have kapitan load go-jsonnet instead of jsonnet if the go-jsonnet module is present. So you could use it and help figure out any issues. How does that sound?
That would be a great way to let more people experiment with it.
@ramaro It would be great to get the option to use go-jsonnet. I'm currently in the progress of adding kube-prometheus to our stack to replace the old prometheus install, but I'm not to happy about the 2 minute extra compile time. I tested it and it solves my issue at least. Can I help you by forking your branch and make it automatically import gojsonnet over jsonnet if available so we can get it merged and release?