telemetry
telemetry copied to clipboard
Use fastjsonschema for json schema validation
Add default option to use fastjsonschema to validate events. fastjsonschema uses code gen and thus should speed up validation since we don't need to traverse the schema again every time we validate an event, and the event json traversal/validation would be a lot faster too. Will add benchmarks soon.
This is relevant for #59 where there is performance concern about running the validation twice.
Inspired by https://github.com/jupyter/nbformat/blob/dd4725fb41515c69bc76e1c7f32f3ede0075725f/nbformat/json_compat.py
@kiendang, are you still working on this PR? This looks good to me. Great work here!
@kiendang, are you still working on this PR? This looks good to me. Great work here!
~Thanks, I'm done. Just that one comment.~
I'm done for now except for that one comment. But I converted to draft again since I want to think a bit about how this integrates with the category filtering, especially in the future where draft-2019/draft-2020 is supported.
Great. Ping me here when you’re ready for a final review.
Sorry for not getting back on this sooner.
jsonschema
just added support for draft 2020-12 Julian/jsonschema#817. However I'm not sure if they have error formatting/annotation collection which is what we need.
I'm less enthusiastic about fastjsonschema
now since it looks like the library is not being regularly maintained now, and if jsonschema
has what we need then it's not necessary.
As mentioned in #66 jschon seems to have exactly what we need (draft 2020-12 and annotation collection), but it is a new library.
My proposal is we defer making decision on this for a bit longer since this is "only" a performance thing. Let me look into jsonschema
more. Meanwhile I'll figure out #61, after that we can cut a new release and merge jupyter-server/jupyter_server#364.
Thanks, @kiendang. How is it going with #61? I'm diving back into this and would love to see telemetry land in jupyter-server soon. I'm happy to help with anything.
I do have a WIP on that. Will find time to finish it somewhere next week. I'd love to see telemetry integrated in too.
I wanted to chime in and say that another reason to be interested in fastjsonschema is that it removes a C dependency! jsonschema now depends on https://pypi.org/project/pyrsistent/ which is not pure python, and since nbformat and friends use it the jupyter stack is not easily installable on pyiodide now