Paulo Valente
@liamkillingback any updates here?
> @polvalente all good! Let me test this locally on CUDA later today before merging!
I think this might be possible if we add a special kind of node for representing the unrepresentable StableHLO operations. My main concern is about how regions can nest --...
I took a cursory look into PjRtBuffer and I didn't find anything related to memory layout offsets, unfortunately. Maybe there's a way to set a tiled layout to work like...
One idea, which is not zero-copy but is at least GPU-to-GPU, is that you could use CUDA calls to forcibly construct a copy of the tensor without the padding, and then...
@lawik I just noticed that your XLA archive download logged "Downloading a precompiled XLA archive for target x86_64-linux-gnu-cpu". You can force it to download the aarch64 archive with the XLA_TARGET_PLATFORM...
I assume aarch64 is the target because it's a Nerves host compilation targeting the rpi4.
I did notice the image uses a rather outdated combo of Elixir and OTP, as well as an older Debian. If possible, I'd update to eliminate any possibility of the...
https://github.com/elixir-grpc/grpc/blob/ab18c938ba9961002ad23d760f284836caf17a81/test/grpc/integration/client_interceptor_test.exs#L61 The feature is there already! I'll change the issue to improve documentation on this. We should probably add a guide with interceptors on both ends.
Noting that this is lacking actual documentation. The Client Interceptor module points to Stub, but Stub just cites the option without explanation: https://hexdocs.pm/grpc/GRPC.Stub.html#connect/2