RDMA Transport
With additions from the IPC work, there's a direct path to putting Aeron on RDMA, allowing Aeron to operate in HPC/data-centric pipelines. This also gives Aeron a path forward for kernel-bypass operation.
Things to consider:
- JXIO provides an immediate path to RDMA via JNI
- Aeron's C API could be used to effectively recreate JXIO, but tailored specifically for Aeron
- The C API also opens the door for other bypass technologies (DPDK, netmap, custom networking stacks, etc.)
I'm more than happy to help out with this work if there's interest.
👍
The concept we originally had for RDMA was to use the memory region directly and see if we could emulate the log buffer memory semantics across hosts. However, the RDMA work was put on hold. There is a lot of potential here that would be worth exploring at some point.
If there is interest, though, especially sponsored interest, please let @mjpt777 and/or me know.
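For anyone picking this up later, here is a minimal sketch of that idea using libibverbs: register the term buffer as a memory region and mirror written ranges into the remote host's log buffer with one-sided RDMA WRITEs. The protection domain, connected queue pair, and out-of-band exchange of the remote address and rkey are all assumed to exist already; none of the names below are Aeron internals.

```c
/*
 * Sketch only: mirror a range of a local term buffer into a remote
 * host's log buffer with a one-sided RDMA WRITE via libibverbs.
 * Assumes a connected RC queue pair and that the remote side has
 * registered its buffer with IBV_ACCESS_REMOTE_WRITE and shared its
 * address/rkey out of band.
 */
#include <infiniband/verbs.h>
#include <stddef.h>
#include <stdint.h>

/* Register the local term buffer so the HCA can DMA from it. */
static struct ibv_mr *register_term_buffer(
    struct ibv_pd *pd, void *term_buffer, size_t length)
{
    return ibv_reg_mr(pd, term_buffer, length, IBV_ACCESS_LOCAL_WRITE);
}

/*
 * Replicate [offset, offset + length) of the term buffer to the same
 * offset in the remote buffer, without waking the remote CPU.
 */
static int post_term_write(
    struct ibv_qp *qp, struct ibv_mr *mr,
    uint64_t remote_base_addr, uint32_t rkey,
    size_t offset, size_t length)
{
    struct ibv_sge sge = {
        .addr = (uint64_t)(uintptr_t)mr->addr + offset,
        .length = (uint32_t)length,
        .lkey = mr->lkey
    };

    struct ibv_send_wr wr = {
        .wr_id = offset,
        .sg_list = &sge,
        .num_sge = 1,
        .opcode = IBV_WR_RDMA_WRITE,
        .send_flags = IBV_SEND_SIGNALED
    };
    wr.wr.rdma.remote_addr = remote_base_addr + offset;
    wr.wr.rdma.rkey = rkey;

    struct ibv_send_wr *bad_wr = NULL;

    return ibv_post_send(qp, &wr, &bad_wr);
}
```

The interesting part for Aeron would be making the remote copy appear exactly like a locally written log buffer, so an unmodified subscriber could poll it.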
The other week I had a play bridging Aeron between hosts using RDMA, and got some very promising-looking latency and throughput numbers. My tests out-performed the media driver using Solarflare under Onload, but I'm using both older Mellanox RoCE cards and older Solarflare cards, so it's not representative of what is possible with the current generation of cards. Solarflare keep chopping 500 ns off the latency with each generation - there's going to be negative latency soon if they keep this up ;-)
For ease of implementation I used RC connections and two-sided IB SEND/RECV messaging, not direct RDMA. Realistically this means that my test would be slower by an IPC write on the target, and there's obviously another memory copy implied by this (as my client is receiving a message to be offered to Aeron). The benefit was that it was a doddle to get something up and running quickly to test the principle.
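For concreteness, a minimal sketch of that two-sided shape: poll the completion queue for an arriving IB SEND, then offer the received bytes to a local Aeron publication through the C client API (`aeronc.h`). Queue-pair setup, receive-buffer management, and re-posting of receive work requests are elided, and `recv_buffer_for` is a hypothetical helper, not part of any real API.

```c
/*
 * Sketch: bridge one received IB message into Aeron. The offer() is
 * the extra memory copy noted above - the bytes land in the
 * publication's log buffer rather than being written there directly.
 */
#include <infiniband/verbs.h>
#include <aeronc.h>
#include <stdint.h>

/* Hypothetical helper: the pre-posted receive buffer for a given wr_id. */
extern uint8_t *recv_buffer_for(uint64_t wr_id);

static void bridge_poll_once(struct ibv_cq *cq, aeron_publication_t *publication)
{
    struct ibv_wc wc;

    if (1 == ibv_poll_cq(cq, 1, &wc) && IBV_WC_SUCCESS == wc.status)
    {
        const uint8_t *buffer = recv_buffer_for(wc.wr_id);
        int64_t result;

        /* Retry only on transient conditions; give up on hard errors. */
        do
        {
            result = aeron_publication_offer(
                publication, buffer, (size_t)wc.byte_len, NULL, NULL);
        }
        while (AERON_PUBLICATION_BACK_PRESSURED == result ||
            AERON_PUBLICATION_ADMIN_ACTION == result);

        /* A real bridge would re-post the receive WR here. */
    }
}
```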
We have plans to produce a native media driver and then add ef_vi support for Solarflare.
@mjpt777 Is there anywhere we can track this? We are already using Aeron in our ML stack and are trying to figure out a viable path to this.
Once the C-based media driver is complete, we will be doing some work on this. Access to environments with Solarflare NICs and high-end switches would be appreciated for testing.
I was exploring using Solarflare TCPDirect with the C media driver. It is very straightforward to port the existing UDP code to it. I've really enjoyed using this API and was going to take a stab at it once you release.
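For anyone curious what such a port looks like, below is a sketch of a receive loop against TCPDirect's UDP "zockets". The `zf` calls follow the shape of Solarflare's published examples but haven't been verified against `zf/zf.h` here, and `on_datagram` is a hypothetical hook standing in for the driver's existing dispatch path.

```c
/*
 * Sketch of a TCPDirect UDP receive loop - roughly what the media
 * driver's recvmsg() path maps onto. Call signatures follow the
 * published TCPDirect examples; treat them as approximate.
 */
#include <zf/zf.h>
#include <sys/uio.h>
#include <stddef.h>

/* Hypothetical hook into the driver's existing datagram dispatch. */
extern void on_datagram(const void *data, size_t length);

static void rx_loop(struct zf_stack *stack, struct zfur *zocket)
{
    for (;;)
    {
        /* Progress the user-space stack: the kernel-bypass analogue
         * of letting the kernel service the socket. */
        zf_reactor_perform(stack);

        /* zfur_zc_recv() fills iovecs pointing straight into the
         * adapter's packet buffers - zero-copy, no syscalls. */
        struct {
            struct zfur_msg msg;
            struct iovec iov[1];
        } rd;
        rd.msg.iovcnt = 1;

        zfur_zc_recv(zocket, &rd.msg, 0);

        if (rd.msg.iovcnt > 0)
        {
            on_datagram(rd.iov[0].iov_base, rd.iov[0].iov_len);
            zfur_zc_recv_done(zocket, &rd.msg);
        }
    }
}
```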
@neomantra if you have access to a Solarflare environment, it would be great to talk to you. We want to support ef_vi directly with the C media driver. Feel free to contact @mjpt777 and me directly.
Any update on ef_vi with C media driver?
Still on the list. I've had some hardware issues, but mainly have been busy with other features. Will get back to ef_vi once Clustering is more fully featured.
ef_vi support is available for the C media driver as a premium commercial offering. Contact [email protected]