Tom Birch
I tried `--host_crosstool_top=@llvm_toolchain//:cc-toolchain-x86_64-linux` but that gives me: `external/bazel_tools/tools/cpp/BUILD:63:11: in :cc_toolchain attribute of cc_library rule @bazel_tools//tools/cpp:malloc: '@llvm_toolchain//:cc-toolchain-x86_64-linux' does not have mandatory providers: 'CcToolchainInfo'`
Ah, I need to pass `--noincompatible_enable_cc_toolchain_resolution` for my wasm build, otherwise it tries to use clang to compile the C++ files for wasm; passing that flag is what disables `llvm_toolchain`.
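A minimal `.bazelrc` sketch of how that flag could be wired up; the `wasm` config name and the target label in the usage line are illustrative, not from this thread:

```
# Hedged sketch: carry the flag in a named config so toolchain resolution
# (and hence llvm_toolchain) is disabled only for the wasm build.
build:wasm --noincompatible_enable_cc_toolchain_resolution
```

The build would then be invoked with something like `bazel build --config=wasm //path/to:wasm_target`.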
@cybersoulK this is just a proof of the bug; in my actual use case I need 20 bytes of data per vertex.
@cybersoulK can you prove there's automatic padding injected for `(vec3, f32, u32)`? In the example I provided, it's very clear that the shader strides 20 bytes per index into `v20`...
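To spell out the arithmetic behind that 20-byte figure, a small sketch in plain Python, assuming standard 4-byte `f32`/`u32` components and a tightly packed vertex buffer:

```python
# Size/stride arithmetic for a tightly packed (vec3, f32, u32) vertex.
VEC3_BYTES = 3 * 4  # three f32 components
F32_BYTES = 4
U32_BYTES = 4

stride = VEC3_BYTES + F32_BYTES + U32_BYTES
print(stride)  # 20 -> matches the per-index stride observed in the shader
```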
https://gist.github.com/froody/f1d4ec656a2110191ea1618187806ba1
@psychocrypt gcc doesn't work either: `nvcc fatal : GNU C/C++ compiler is no longer supported as a host compiler on Mac OS X.`
Apparently the solution is to add `os.environ["TP_SOCKET_IFNAME"] = "enp1s0f0"`
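For anyone hitting the same thing, a hedged sketch of where that line goes; the interface name is machine-specific, and the commented-out init call is a placeholder for whatever process-group/RPC setup the job actually uses:

```python
# The env var must be in place before torch.distributed / TensorPipe initialization.
# "enp1s0f0" is the NIC from the comment above; substitute your own interface name.
import os
os.environ["TP_SOCKET_IFNAME"] = "enp1s0f0"

import torch.distributed as dist  # imported after the env var is set

# dist.init_process_group(...)  # placeholder; backend and args depend on the job
```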
Do you mean `torch.cuda.set_device()`? If so, then yes. I also changed torch_ucc to use `cudaGetDevice` in `ProcessGroupUCC::progress_loop` instead of hard-coding device 0.
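A minimal sketch of the `torch.cuda.set_device()` part, pinning each rank to its own GPU before any collectives run; the `LOCAL_RANK` environment variable is an assumption about the launcher (e.g. torchrun sets it), not something from this thread:

```python
# Select this rank's GPU up front so later CUDA work (and the process group)
# doesn't silently land on device 0.
import os
import torch

local_rank = int(os.environ.get("LOCAL_RANK", "0"))  # assumes a torchrun-style launcher
torch.cuda.set_device(local_rank)
```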
Any updates on this? There seem to be some blosc2 plugins available (e.g. pytables) but none support arbitrary filters as far as I can tell. I need BYTEDELTA to get...
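For context, roughly what I want to be able to do through HDF5, shown here with python-blosc2 directly; this is only a sketch, and it assumes the installed blosc2 version exposes `Filter.BYTEDELTA` and that `compress2`/`decompress2` accept a `filters` list the way shown:

```python
# Hedged sketch: compress a float array with a BYTEDELTA + SHUFFLE filter pipeline.
import numpy as np
import blosc2

data = np.arange(1_000_000, dtype=np.float32)

compressed = blosc2.compress2(
    data,
    typesize=data.itemsize,
    filters=[blosc2.Filter.BYTEDELTA, blosc2.Filter.SHUFFLE],
    codec=blosc2.Codec.ZSTD,
)
restored = np.frombuffer(blosc2.decompress2(compressed), dtype=np.float32)
assert np.array_equal(data, restored)
```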
I've seen that before; how does that answer my question? Is HDF5 going to adopt that proposal?