[BUG] mlx-community/gpt-oss-120b-MXFP4-Q8 stuck in loading/failed loop
**Describe the bug**
After updating to the latest macOS app, I can no longer create an instance with this model using Tensor and RDMA.
**To Reproduce**
Steps to reproduce the behavior:
- Select mlx-community/gpt-oss-120b-MXFP4-Q8
- Select Tensor and RDMA
- Click Launch
**Expected behavior**
The instance launches.
**Actual behavior**
The instance never starts; it cycles repeatedly through loading, failed, and unknown states.
**Environment**
- macOS Version: 26.2
- EXO Version: Latest
- Hardware:
  - Device 1: M4 Max Mac Studio
  - Device 2: M4 Pro Mac mini
- Interconnection:
  - TB5 between both Macs
https://github.com/user-attachments/assets/a0b9fc0e-e0b6-447d-a54a-ffbb89af804f
I’m running exo on two Mac Studios (macOS Tahoe 26.2, Thunderbolt 5, RDMA enabled).
I noticed that:
- Pipeline and MLX Ring modes allow selecting 2 nodes and work as expected.
- But when I select Tensor or Tensor + MLX RDMA, the UI only allows 1 node (the minimum node count is locked to 1).
Both machines can run exo individually, models are synced, and RDMA is enabled via `rdma_ctl enable`.
Is this a current limitation of exo’s Tensor/RDMA implementation, or is there something missing in my setup? Has anyone been able to use Tensor + RDMA with multiple nodes?
Sorry @nackerr @aaronysl, it looks like 1.0.61/1.0.62 introduced some serious regressions. For now I'd recommend using 1.0.60, which you can download here: https://assets.exolabs.net/EXO-1.0.60.dmg
Thank you for reporting the bug - it helps a lot. We are working on a fix for 1.0.63.
Heya - we don't have Tensor OSS support in 1.0.60, but it should be supported once #1144 lands.
Should be fixed in the next build!