materialize icon indicating copy to clipboard operation
materialize copied to clipboard

testdrive: Integrate the jq library for inspecting JSONs

Open philip-stoev opened this issue 3 years ago • 8 comments

Motivation

Tips for reviewer

Testing

  • [ ] This PR has adequate test coverage / QA involvement has been duly considered.

Release notes

This PR includes the following user-facing behavior changes:

philip-stoev avatar Jun 23 '22 13:06 philip-stoev

I made changes to the Dockerfile of the ci-builder, but now compilation from within the container fails as follows:

  = note: x86_64-unknown-linux-gnu-ld.lld: error: undefined symbol: __libc_csu_fini
          >>> referenced by start.S:101 (../sysdeps/x86_64/start.S:101)
          >>>               /opt/x-tools/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/sysroot/usr/lib/../lib64/Scrt1.o:(_start)
          
          x86_64-unknown-linux-gnu-ld.lld: error: undefined symbol: __libc_csu_init
          >>> referenced by start.S:102 (../sysdeps/x86_64/start.S:102)
          >>>               /opt/x-tools/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/sysroot/usr/lib/../lib64/Scrt1.o:(_start)
          collect2: error: ld returned 1 exit status

Which is some gcc/glibc problem.

philip-stoev avatar Jun 27 '22 08:06 philip-stoev

Neat! As a rule we avoid introducing dependencies on system libraries—so rather than modifying the ci-builder to include libjq, enable the bundled feature of jq-sys. That way developers only need a C/C++ compiler (which ofc they already have) and will compile libjq on the fly. Hopefully that fixes your compilation error, too.

benesch avatar Jul 04 '22 04:07 benesch

@benesch . I did as you suggested and now the only external dependency required is "libtool". Is that acceptable? Should I add it to doc/developer/guide.md and ci/builder/Dockerfile ?

Without libtool present, one gets:

   Compiling http-serde v1.1.2
error: failed to run custom build command for `jq-sys v0.2.2`

Caused by:
  process didn't exit successfully: `/home/pstoev/philip-stoev-testdrive-json-jq/target/release/build/jq-sys-b2faca83ffe2a11d/build-script-build` (exit status: 101)
  --- stdout
  running: "sh" "-c" "exec \"$0\" \"$@\"" "autoreconf" "-fi"
  running: "sh" "-c" "exec \"$0\" \"$@\"" "autoreconf" "-fi"
  running: "sh" "-c" "exec \"$0\" \"$@\"" "autoreconf" "-fi"

  --- stderr
  configure.ac:36: warning: macro 'AM_PROG_LIBTOOL' not found in library
  configure.ac:36: error: possibly undefined macro: AM_PROG_LIBTOOL
        If this token and others are legitimate, please use m4_pattern_allow.
        See the Autoconf documentation.
  autoreconf: /usr/bin/autoconf failed with exit status: 1
  thread '<unnamed>' panicked at '
  command did not execute successfully, got: exit status: 1

philip-stoev avatar Sep 14 '22 11:09 philip-stoev

Yeah, requiring autoconf/automake/libtool seems fine! You'll need to update the ci-builder to include those deps too, looks like.

benesch avatar Sep 14 '22 20:09 benesch

@benesch this is ready for another look. I added a third-party crate to compare the actual and expected EXPLAIN PHYSICAL PLAN JSON-s . However, this library panics so I also needed to catch the panic :-(

philip-stoev avatar Sep 16 '22 17:09 philip-stoev

@benesch Thanks a lot for all the PR iterations! Unfortunately there is a yet another surprise -- the bundled version of jq does not build on OSX. Only the jq main source branch from github has the required patch . Should I fork jq-sys and bundle a fixed version of the jq source?

philip-stoev avatar Sep 18 '22 12:09 philip-stoev

@philip-stoev looks like we're still fighting hard with making jq compile on macOS. I do wonder if we might be barking up the wrong tree here. @aalexandrov has proposed to make the output of EXPLAIN nestable in SELECT statements (link).

If we do that, ISTM we could use Materialize/PostgreSQL's JSON operators instead of integrating jq?

Now that I've typed this all up... I'm getting deja vu. I seem to recall that we talked about this way back when, and there was some specific JSON expression that was easy to express in jq but hard or impossible to express in Materialize?

benesch avatar Sep 20 '22 04:09 benesch

@benesch Yes, it is about jq's double-dot operator, which allows you to fish for a particular key regardless of its position within the tree. I believe that tests written that way would be more robust then tests that have a hard-coded path from the root to the desired key.

philip-stoev avatar Sep 20 '22 06:09 philip-stoev

@benesch Thanks a lot for all the PR iterations! Unfortunately there is a yet another surprise -- the bundled version of jq does not build on OSX. Only the jq main source branch from github has the required patch . Should I fork jq-sys and bundle a fixed version of the jq source?

Sorry for not answering your actual question sooner. Yeah, I guess so. I don't see another option.


Although... we could consider adding the .. operator to Materialize directly! Not literally as .., I don't think, but as something in line with the existing ->> options. Would that be crazy?

benesch avatar Sep 22 '22 06:09 benesch

@benesch adding such an operator would be beyond my abilties so needs to be scheduled to a developer ... so I am going for the fork et al since this is something that I can do myself.

philip-stoev avatar Sep 23 '22 09:09 philip-stoev

This is hopeless. Even when using the most recent jq source, the compilation error is only fixed for iOS but not for OSX :-( . I am unable to make any further progress on this. @aalexandrov , we will need to go with SELECT (EXPLAIN AS JSON SELECT ...)->some_json_expression_here;

philip-stoev avatar Sep 26 '22 09:09 philip-stoev

Sorry about that, @philip-stoev! Feel free to file a feature request for a recursive descent operator!

benesch avatar Sep 28 '22 02:09 benesch

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Dec 08 '22 17:12 CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant avatar Dec 08 '22 17:12 CLAassistant

I would suggest just using something like this in testdrive, works today:

$ set-from-sql var=plan
EXPLAIN PHYSICAL PLAN AS JSON FOR SELECT 1

> SELECT '${plan}'::json->'plans'->0->>'id'::text
"Explained Query (fast path)"

def- avatar Jan 30 '24 17:01 def-