ABI stability within patch releases

Open pentschev opened this issue 9 months ago • 3 comments

Parts of UCX, primarily UCP, promise that the public API is forward-compatible. That is very useful for packaging because it allows consumers of UCX to build once and reuse the same binaries even across minor UCX releases. However, this is not the case for UCS, for example, and some consumers, such as UCC, do rely on it. This causes a problem for packaging because a "rebuild the world" is required even for UCX patch releases.

Would providing ABI compatibility for all modules at least across patch releases of the same minor UCX version be possible?
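
To make the request concrete, here is a minimal sketch of the rebuild rule a packager could apply if this guarantee existed. It expresses the policy being asked for, not what UCX guarantees today. The function names and the version strings are hypothetical, and it assumes plain numeric `X.Y.Z` versions.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a plain numeric 'X.Y.Z' string (assumption: no suffixes such as '-rc1')."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def needs_rebuild(built_against: str, runtime: str) -> bool:
    """Hypothetical policy: binaries stay valid across patch releases of the
    same major.minor series, and must be rebuilt when major or minor differs."""
    built = parse_version(built_against)
    current = parse_version(runtime)
    return (built.major, built.minor) != (current.major, current.minor)


# Illustrative version numbers only.
print(needs_rebuild("1.18.0", "1.18.1"))
print(needs_rebuild("1.18.1", "1.19.0"))
```

Under this rule a patch bump would never trigger a rebuild of downstream packages, which is the behavior packaging needs.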

pentschev (May 14 '25 19:05)

> However, this is not the case for UCS, for example, and some consumers, such as UCC, do rely on it. This causes a problem for packaging because a "rebuild the world" is required even for UCX patch releases.

It is worth noting that both major MPI implementations, MPICH and Open MPI, also end up linking directly to libucs when built with the standard flag (--with-ucx). So this is definitely not a niche use case, but something that affects all UCX consumers (through MPI).

leofang (May 14 '25 19:05)

> However, this is not the case for UCS, for example, and some consumers, such as UCC, do rely on it. This causes a problem for packaging because a "rebuild the world" is required even for UCX patch releases.
>
> It is worth noting that both major MPI implementations, MPICH and Open MPI, also end up linking directly to libucs when built with the standard flag (--with-ucx). So this is definitely not a niche use case, but something that affects all UCX consumers (through MPI).

I added that explicit link to UCS in MPICH (https://github.com/pmodels/mpich/commit/a9255590caf5302fd169bd7b731f0747d943b4d2). However, I believe the build issue I cited at the time was caused by an invalid RPATH in libucp.so in the HPC-X distribution (https://elist.ornl.gov/mailman/private/ucx-group/2018-September/000715.html). More recent versions of HPC-X don't appear to set RPATH for libucp, so it might be safe to remove the explicit -lucs and let MPICH pick it up via inter-library dependencies.
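
One way to check whether a given build still depends on libucs directly is to look at the `NEEDED` entries that `readelf -d` prints for the MPI library. The sketch below only parses that text. The sample output, the sonames in it, and the helper names are illustrative assumptions, not taken from a real MPICH or HPC-X build.

```python
import re

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libucs.so.0]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")

# Illustrative `readelf -d` excerpt (hypothetical, not from a real build).
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libucp.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libucs.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libmpi.so.12]
"""


def needed_libraries(readelf_output: str) -> list[str]:
    """Return the direct (DT_NEEDED) dependencies listed in `readelf -d` output."""
    return NEEDED_RE.findall(readelf_output)


def links_directly(readelf_output: str, stem: str) -> bool:
    """True if a library whose name starts with `stem` + '.so' is a direct dependency."""
    return any(name.split(".so")[0] == stem for name in needed_libraries(readelf_output))


print(needed_libraries(SAMPLE))
print(links_directly(SAMPLE, "libucs"))
```

If the explicit -lucs were dropped and no UCS symbols were referenced directly, libucs would be expected to disappear from this list and be reached only through libucp's own dependencies.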

raffenet (May 15 '25 17:05)

I guess we still need some guarantees about any public UCS types or data structures used by MPICH, but that should be it.
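
As a rough illustration of why public types need such a guarantee, the sketch below uses `ctypes` to compare two layouts of a made-up structure. `ConfigV1`, `ConfigV2`, and their fields are hypothetical and do not correspond to any real UCS type. Inserting a field in the middle shifts the offset of every field after it, so a consumer compiled against the old header would read the wrong bytes.

```python
import ctypes


class ConfigV1(ctypes.Structure):
    """Hypothetical public struct as shipped in an older release."""
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("length", ctypes.c_uint64),
    ]


class ConfigV2(ctypes.Structure):
    """Same hypothetical struct after a field was inserted in the middle."""
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("timeout", ctypes.c_uint64),
        ("length", ctypes.c_uint64),
    ]


def layout(struct: type) -> dict[str, int]:
    """Map each field name to its byte offset inside the struct."""
    return {name: getattr(struct, name).offset for name, _ in struct._fields_}


def moved_fields(old: type, new: type) -> list[str]:
    """Fields of `old` that are missing or sit at a different offset in `new`."""
    old_layout, new_layout = layout(old), layout(new)
    return [name for name, off in old_layout.items() if new_layout.get(name) != off]


print(moved_fields(ConfigV1, ConfigV2))
print(ctypes.sizeof(ConfigV1), ctypes.sizeof(ConfigV2))
```

Appending new fields at the end keeps existing offsets stable, but it still changes the struct size, which matters whenever the consumer allocates or embeds the structure itself.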

raffenet (May 15 '25 18:05)