
Does mpich work over a WAN?

Open linjian-tech opened this issue 1 year ago • 3 comments

We have successfully set up two nodes that communicate with each other via mpi4py (with MPICH as the backend MPI) on our LAN. However, when we set both nodes' IPs to their WAN IPs, the client cannot connect to the server (ERROR MESSAGE: HYDU_sock_connect (utils/sock/sock.c:143): unable to connect from "node5" to "node5" (Connection refused) [proxy:0:1@node5] main (pm/pmiserv/pmip.c:181): unable to connect to server node5 at port 46135 (check for firewalls!)). Can you help us with this? Thanks!

linjian-tech avatar Dec 19 '23 02:12 linjian-tech

Are there firewall or gateway settings that prevent connections to arbitrary ports?
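One way to test this independently of MPI is to attempt a plain TCP connection to the host and port named in the error message. The following is a minimal sketch using only the Python standard library; the helper name `can_connect` is made up for illustration. The port is typically different on each run, so the check only makes sense while the launcher is still listening on the port it reported.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves `host` and tries each resulting address in turn
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


# Example call, using the host and port from the error message (illustrative values):
# can_connect("node5", 46135)
```

A result of False can mean either a blocked port or a hostname that resolves to an address where nothing is listening, so it is worth checking name resolution as well.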

hzhou avatar Dec 20 '23 01:12 hzhou

Thank you for your answer. We tried opening all ports on the cloud instance, but it still does not work. However, it works fine on the LAN consisting of the cloud instances.

linjian-tech avatar Dec 20 '23 01:12 linjian-tech

unable to connect from "node5" to "node5"

What are the contents of /etc/hosts on "node5"? If you log in to node5, can you "ssh node5" or does it fail?
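To see what the node's own hostname resolves to, something like the following can be run on each node. It is only a sketch using the Python standard library; `resolve_ipv4` is a hypothetical helper name, and the output reflects the local resolver configuration (including /etc/hosts), not necessarily the exact lookup MPICH performs.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses that `name` resolves to, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address
    return sorted({info[4][0] for info in infos})


hostname = socket.gethostname()
print(hostname, "->", resolve_ipv4(hostname))
```

An empty list, or an address that is not reachable from the other node, would be consistent with the "unable to connect" error above.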

I would try logging in to each node and then running

echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts > /dev/null

NOTE: I assume you understand the consequences of editing the system /etc/hosts file.

dalcinl avatar Dec 22 '23 20:12 dalcinl

Assuming the issue is resolved.

hzhou avatar Apr 10 '24 20:04 hzhou