feat: jq function

Open tychoish opened this issue 2 years ago • 1 comments

This is an experiment/curiosity. I don't think we should necessarily merge it as is. Questions that remain:

  • should we mirror the kdl_match() and kdl_select() functions, or is jq weird/known enough that it shouldn't matter?

  • This is going to add a build-time dependency of gcc/autotools to compile the bundled C bits. This was true on my system.

    See this bit of the readme:

    Using this feature requires having autotools and gcc in PATH in order for the build to work.

    I suspect that clang/llvm bits would work as well as gcc, but I haven't dug in.

  • I put this behind a feature flag with it disabled by default, and boy is that annoying to develop. 😂 This does, however, make it much easier to turn off.

  • I chose somewhat arbitrary values for the memoization cache for the compiled queries (and also applied the changes to the KDL functions there). This cache lets us amortize the cost of compiling the jq program when the function is called often.

  • I haven't thought about the implications of running this in a multi-tenant environment, and this is tricky: jq has a module system, which means you can use it to read data off the filesystem (only json and jq files, only things that are parseable, etc.). The other security concerns are probably limited, but we definitely want to think harder about this.
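To make the caching point above concrete, here is a minimal sketch of a bounded cache for compiled queries. It is not the code from this change: `CompiledQuery`, `QueryCache`, and the FIFO eviction policy are all hypothetical stand-ins, and the `compile` closure stands in for whatever the jq binding actually exposes.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical stand-in for a compiled jq program; the real type would
/// come from whichever jq binding is in use.
#[derive(Debug)]
pub struct CompiledQuery {
    pub source: String,
}

/// Bounded cache keyed by query text, with simple FIFO eviction.
/// (Assumption: the eviction policy is chosen here only for illustration.)
pub struct QueryCache {
    capacity: usize,
    entries: HashMap<String, Arc<CompiledQuery>>,
    order: VecDeque<String>,
    /// Number of times the compile closure was invoked (cache misses).
    pub compile_calls: usize,
}

impl QueryCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
            compile_calls: 0,
        }
    }

    /// Return the cached program for `query`, compiling it on a miss.
    pub fn get_or_compile<F, E>(
        &mut self,
        query: &str,
        compile: F,
    ) -> Result<Arc<CompiledQuery>, E>
    where
        F: FnOnce(&str) -> Result<CompiledQuery, E>,
    {
        if let Some(hit) = self.entries.get(query) {
            return Ok(Arc::clone(hit));
        }
        self.compile_calls += 1;
        let compiled = Arc::new(compile(query)?);
        // Evict the oldest entry once the cache is full.
        if self.entries.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(query.to_string(), Arc::clone(&compiled));
        self.order.push_back(query.to_string());
        Ok(compiled)
    }
}
```

The capacity is the "somewhat arbitrary value": a larger cache helps workloads with many distinct queries, while a small one is enough when the same filter is applied to every row.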

In general the pros are:

  • this is real jq, no substitutions, no impersonators.

  • jq itself handles all of the json processing, so you don't have to parse the json and then pass it through the filter, (and write it back to a string?)

The cons:

  • the module (import/include) subsystem is hard to turn off and is unflagged within jq at the moment.

  • jq-rs is behind the latest version of jq, and while I'm watching and poking the repo, this might be a bit of a liability in the long run.

  • extra build dependencies (though nothing exciting).

Thoughts on jaq after working on this more:

  • appealing because jaq doesn't support modules.

  • pure rust 🤷

  • we can support both, in theory [as I wrote this I realized that this actually fixes the module problem because modules would be a parse error for jaq, so we just parse the query twice.]

  • the serialization dance with jaq is pretty annoying.
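The "parse the query twice" idea above could look roughly like the sketch below. It assumes, as the note says, that module syntax is a parse error in the stricter parser. Both engines are passed in as closures because the actual jaq and jq-rs calls are not shown here; `compile_vetted` and `QueryError` are hypothetical names.

```rust
/// Why a query was refused.
#[derive(Debug, PartialEq)]
pub enum QueryError {
    /// The strict (module-free) parser refused the query.
    Rejected(String),
    /// The strict parser accepted it, but the real engine failed to compile it.
    Compile(String),
}

/// Gate a query through a strict parser before handing it to the real engine.
///
/// `strict_parse` stands in for the parser that has no module support;
/// `compile` stands in for the real jq compile step. The query only reaches
/// `compile` if `strict_parse` accepted it.
pub fn compile_vetted<P, S, C>(
    query: &str,
    strict_parse: S,
    compile: C,
) -> Result<P, QueryError>
where
    S: FnOnce(&str) -> Result<(), String>,
    C: FnOnce(&str) -> Result<P, String>,
{
    strict_parse(query).map_err(QueryError::Rejected)?;
    compile(query).map_err(QueryError::Compile)
}
```

The gate is only as strong as the strict parser: anything it accepts goes to the real engine unchanged, so this relies on the two grammars agreeing everywhere except on modules.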

tychoish • Dec 24 '23 04:12

When we can get dynamic loading of functions (including having ways of not loading some functions in cloud), this could be an interesting thing to land.

tychoish • Feb 28 '24 20:02

superseded by #3091 (for security reasons)

tychoish • Jul 17 '24 17:07