Thunderbolt 5 mesh of Mac Minis

Open fkorotkov opened this issue 1 year ago • 5 comments

Hi there,

I've just got 3 M4 Pro Minis to experiment with MLX, exo, etc. One thing I noticed when I connect all of them in a ring:

graph TD
    Mac1[Mac Mini 1] <--> Mac2[Mac Mini 2]
    Mac2 <--> Mac3[Mac Mini 3]
    Mac3 <--> Mac1

is that not all the Thunderbolt connections are active (in my example above, the connection between 1 and 3 is not active even though there is a cable between them). It looks like Apple does not allow cycles in the topology. Does anyone have experience running multiple Minis with Thunderbolt 5 between them?

I'm measuring throughput via iperf, and it seems each hop between Minis affects it. If it's not possible to have cycles in the topology, I can imagine this solution might not be that scalable.
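To put rough numbers on the scalability concern, here is a minimal sketch (a graph model, not a measurement) that compares the worst-case hop count of a full ring against the same ring with its closing link inactive, which is what I seem to be getting between Mini 1 and Mini 3. The function names are made up for illustration, and it assumes every active link is a single hop.

```python
from collections import deque


def hop_counts(n, links, start=0):
    """Breadth-first search: hops from `start` to every reachable node."""
    neighbors = {i: set() for i in range(n)}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


def worst_case_hops(n, links):
    """Largest hop count between any pair of nodes (assumes a connected graph)."""
    return max(max(hop_counts(n, links, s).values()) for s in range(n))


def ring(n):
    # Every Mini cabled to the next one, with the last cabled back to the first.
    return [(i, (i + 1) % n) for i in range(n)]


def chain(n):
    # The same cabling with the closing link inactive.
    return [(i, i + 1) for i in range(n - 1)]


for n in (3, 4, 8):
    print(n, "Minis: ring", worst_case_hops(n, ring(n)),
          "hops, chain", worst_case_hops(n, chain(n)), "hops")
```

With 3 Minis the difference is 1 hop versus 2, but in this model the chain's worst case grows as n - 1 while the ring's grows as n // 2.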

I'd appreciate any insights from folks who have experience with many Minis and Thunderbolt.

fkorotkov avatar Dec 02 '24 14:12 fkorotkov

Also interested in a ring network topology using Thunderbolt 5.

ivahno avatar Dec 02 '24 17:12 ivahno

I'm fascinated by this project too. I'm hoping to build up a cluster in the next month. You've probably already seen this @fkorotkov, but just in case, did you follow the Thunderbolt connection process here?

Looking forward to hearing about your progress.

chachwick avatar Dec 03 '24 14:12 chachwick

@fkorotkov Fedor, when you add the Thunderbolt connections, the Mac minis span an IP network. You probably gave them IP addresses in one private network (10.0.0.0/8, or maybe 192.168.0.0/16). That means a single network is used to send packets to your 3 Mac minis, so the additional Thunderbolt connection is not used because the first Thunderbolt connection takes precedence via the default route. You will need to tell the two Mac minis that share the additional direct Thunderbolt connection that the Mac on the other side should be reached via the second Thunderbolt bridge. In other words, on both of them you'll have to add a direct route that sends packets via your second Thunderbolt device.
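As a rough sketch of what such a direct route could look like, the snippet below only builds and prints the commands; it does not change any routing. The IP addresses and the interface name `bridge1` are placeholders I made up, so check your own with `ifconfig`. The `route add -host ... -interface ...` form is the macOS syntax as I understand it; please verify against `man route` before running anything.

```python
import ipaddress
import shlex

# Hypothetical values for illustration only.
MAC1_IP = "10.0.0.1"
MAC3_IP = "10.0.0.3"
SECOND_TB_INTERFACE = "bridge1"  # assumed name of the second Thunderbolt bridge

# Both addresses sit in the same private network, which is why the
# extra link is ignored until a more specific host route exists.
network = ipaddress.ip_network("10.0.0.0/8")
assert all(ipaddress.ip_address(ip) in network for ip in (MAC1_IP, MAC3_IP))


def host_route_command(peer_ip, interface):
    """Return a command string that pins one peer to one interface."""
    return shlex.join(
        ["sudo", "route", "-n", "add", "-host", peer_ip, "-interface", interface]
    )


# Each side of the extra cable needs a route pointing at the other side.
print("on Mac 1:", host_route_command(MAC3_IP, SECOND_TB_INTERFACE))
print("on Mac 3:", host_route_command(MAC1_IP, SECOND_TB_INTERFACE))
```

The interface name may well differ between the two machines, so treat the printed lines as templates rather than something to paste as-is.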

OhJayGee avatar Dec 16 '24 10:12 OhJayGee

I have the same problem. But in my case, when I connected mac3 and mac1 using Thunderbolt 5, the entire mesh network became very unstable. When I ping mac3 from mac1, it sometimes times out. I suspect there is network congestion or a network storm. @OhJayGee @fkorotkov Do you know how to resolve this? Thanks!

iseanwang avatar Jan 14 '25 06:01 iseanwang

@OhJayGee gave the correct answer. Go to "Manage Virtual Interfaces" and create a separate bridge network for each Thunderbolt port. By default, it throws all the ports into a single bridge. Then manually assign the same static IP to both bridges. Do this for each node, and they should all be able to ping each other through the direct bridge instead of hopping around the loop.

Image

khant14 avatar Jan 17 '25 02:01 khant14

What about 4 Mac minis with Thunderbolt connections among all of them (6 cables)? Is it like this:

Macmini1: 3 separate bridge networks, one for each of the used Thunderbolt ports, with the same 1 static IP (of the Macmini)
Macmini2: 3 separate bridge networks, one for each of the used Thunderbolt ports, with the same 1 static IP (of the Macmini)
and so on ...

Then I would have 12 separate bridge networks across all Macminis and 4 static IP addresses. On the other hand, there are 6 peer-to-peer connections (cables), each associated with 2 bridge networks (for example, Macmini 1-2 and 2-1).

Right?
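To double-check my own counting, here is a small sketch that enumerates the cables and bridge endpoints for a full mesh. It assumes the scheme described above (one bridge per used Thunderbolt port, one static IP per machine); `full_mesh_plan` is just a name I made up.

```python
from itertools import combinations


def full_mesh_plan(n):
    """Count cables, bridges, and static IPs for n fully connected Macminis."""
    nodes = [f"Macmini{i}" for i in range(1, n + 1)]
    # One cable per unordered pair of machines.
    cables = list(combinations(nodes, 2))
    # Each cable ends in one bridge on each side, e.g. (1 -> 2) and (2 -> 1).
    bridges = [(a, b) for a, b in cables] + [(b, a) for a, b in cables]
    return {
        "static_ips": len(nodes),
        "cables": len(cables),
        "bridges": len(bridges),
        "bridges_per_node": n - 1,
    }


print(full_mesh_plan(4))
# {'static_ips': 4, 'cables': 6, 'bridges': 12, 'bridges_per_node': 3}
```

So the arithmetic at least is consistent: n(n-1)/2 cables and n(n-1) bridges. Whether macOS accepts this setup is a separate question.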

dirksaller avatar Mar 07 '25 11:03 dirksaller