He Jia

Results 32 issues of He Jia

It seems there is no API for rendezvous in UCP, and there is no clear documentation or example for UCT.

I do not know what kind of test is appropriate for scenarios including PCI-E, RDMA, TCP/IP, and others.

It seems only clang and nvcc are supported in rules_cuda.

There are 8 cards in one node. Should I create endpoints for the other 7 cards in GPUx? Or do I need to use different methods when intra-node GPU...

I'm not sure if I understand this correctly. Too many nbx operations can't be submitted to the UCP worker, or ucp_worker_progress will process them too slowly. So is there...

I noticed there are some APIs related to epoll in UCX. Is it possible to use io_uring?

In XLA, the PJRT API can be used to access XLA kernels from C++ code, which is how the PyTorch XLA backend is implemented. Is there any way to access Tile-lang...

For example, one from GPU, and the other from Host.

INFO: Invocation ID: 016e1cd5-c232-40d3-9c1d-dcf15ee690c3
WARNING: Build options --features and --host_features have changed, discarding analysis cache (this can be expensive, see https://bazel.build/advanced/performance/iteration-speed).
INFO: Analyzed target @@hedron_compile_commands~//:refresh_all (0 packages loaded, 3516 targets...