New transports in UCX (SISCI transport for UCX)

Open Hilmarr opened this issue 9 months ago • 2 comments

Hi,

I posted a discussion thread asking what a good roadmap would be for supporting a new transport in UCX: https://github.com/openucx/ucx/discussions/10530

I'm trying to get in contact with someone to get the ball rolling on this. I'm currently working on a prototype, and the idea is that it can eventually be pushed upstream so that UCX supports our transport.

I've written about it in the discussion thread, so I won't repeat all of it here; I'll just mention what I forgot to include there. The underlying technology connects computers over PCIe (no software overhead for PIO).

The computers are connected through non-transparent bridging (NTB), so they can send packets to each other's memory much as if the other computer's RAM were a device plugged directly into the motherboard. The remote machine's memory is therefore accessed in a similar manner to a graphics card or other device in a PCIe slot. We have a software stack that handles the lower-level details, and customers program applications using the SISCI API. The idea is to implement a UCX transport that uses the SISCI API.
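To make the PIO access model concrete, here is a minimal sketch of what "writing to remote memory like a PCIe device" looks like from the CPU's side. It does not use the SISCI API: `map_remote_segment`, `pio_put`, and `mapped_segment_t` are hypothetical names, and a local array stands in for the remote machine's RAM so the snippet compiles without any hardware or library. In a real setup the mapped pointer would come from the SISCI segment mapping calls instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: a local array simulates the remote machine's RAM.
 * With real NTB hardware this would be a window mapped over PCIe. */
enum { SEGMENT_SIZE = 4096 };
static uint8_t simulated_remote_ram[SEGMENT_SIZE];

/* Hypothetical handle for a mapped remote segment. */
typedef struct {
    volatile uint8_t *base; /* start of the mapped window */
    size_t            size; /* window length in bytes */
} mapped_segment_t;

/* Hypothetical mapping step: returns a window over the simulated RAM. */
static mapped_segment_t map_remote_segment(void)
{
    mapped_segment_t seg = { simulated_remote_ram, SEGMENT_SIZE };
    return seg;
}

/* PIO "put": the CPU issues ordinary stores into the mapped window.
 * Returns 0 on success, -1 if the range falls outside the window. */
static int pio_put(mapped_segment_t *seg, size_t offset,
                   const void *src, size_t len)
{
    const uint8_t *bytes = src;
    size_t i;

    assert(seg != NULL && src != NULL);
    if (offset > seg->size || len > seg->size - offset) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        seg->base[offset + i] = bytes[i]; /* one store per byte */
    }
    return 0;
}
```

The point of the sketch is only that a PIO transfer is a sequence of CPU stores with no system call or descriptor posting on the data path; a UCX transport built on this model would presumably wrap such stores behind its put/short-message operations.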

This technology is mainly used for small-to-medium-sized, very high-performance clusters.

Hilmarr • Mar 17 '25 13:03

Hi @Hilmarr, please see our guidelines for contributors, which describe the procedures and expectations for pull requests in the repo. Note the CLA signing requirement at the top of that page.

The prototype you mentioned seems to be based on a very old UCX version. Therefore, you need to rebase it on current code first. When it is ready, feel free to open a PR and we'll review it.

gleon99 • Mar 18 '25 15:03

Okay, thank you @gleon99. I'll get started on that. I've managed to get the prototype more up-to-date and cleaned up the code a bit.

I'll keep this issue open, since I think I can use it for my pull request later.

Hilmarr • Mar 18 '25 16:03