
Add multi host pjit tests

Open sudhakarsingh27 opened this issue 3 years ago • 7 comments

Added multi-host pjit tests to the existing test setup of 2 nodes with 1 process per GPU.

sudhakarsingh27 avatar Sep 21 '22 00:09 sudhakarsingh27

I don't think there's any need to repeat the tests here. Why not just subclass the original pjit tests and run them in Slurm?

yashk2810 avatar Sep 21 '22 03:09 yashk2810

Is that straightforward? Basically, are the upstream single-host pjit tests extensible enough to be run in a multi-host, multi-GPU scenario? I'm asking because adapting these tests from the TPU case to the GPU case took significant effort.

I agree that in the longer term the existing test cases should be generic enough to run on any setup/backend, but for now I think we should get these tests in so that the GPU CI tests at least something other than psum and pmap.

(Also, @skye mentioned that she was perhaps looking at some of those upstream pjit tests and making them generic?)

sudhakarsingh27 avatar Sep 21 '22 18:09 sudhakarsingh27

Now that #5498 is complete, we support most of extended query mode, including:

  • extended query mode message flow
  • all text-format parameters
  • all text-format results
  • all kinds of prepared statements with parameters

Some type formats are still not supported:

  • [x] binary format param
    • [ ] struct
    • [ ] list
    • [x] interval
    • [x] timestampz
  • [x] binary format result
    • [ ] struct
    • [ ] list
    • [x] interval
    • [x] timestampz

If the client uses text format, extended query mode can be used. If the client uses binary format, extended query mode can also be used, except with the types that are not yet supported.

yashk2810 avatar Sep 21 '22 21:09 yashk2810

addressed the concerns

sudhakarsingh27 avatar Sep 22 '22 04:09 sudhakarsingh27

fixed lint and doc build issues now

sudhakarsingh27 avatar Sep 22 '22 18:09 sudhakarsingh27

Done with making changes as requested

sudhakarsingh27 avatar Sep 23 '22 00:09 sudhakarsingh27

A few minor comments; after that, LGTM.

yashk2810 avatar Sep 23 '22 01:09 yashk2810

ping after fixing nits

sudhakarsingh27 avatar Sep 23 '22 19:09 sudhakarsingh27