Can you go across multiple nodes?

Open tonyreina opened this issue 6 years ago • 2 comments

Is it possible to use devices that are on different machines? For example, in Horovod I can specify the IP addresses of multiple machines and do data parallelism across them. However, this requires me to have MPI set up on each machine. It's unclear to me whether this can be done with TF Mesh. Maybe with a tf.train.ClusterSpec and the parameter server model?

Thanks. -Tony

tonyreina, Nov 09 '18 20:11

Did you find a solution for this? @toponado

hpc-unex, Jun 01 '20 11:06

Is there a response to this question?

aprabh2, Mar 30 '22 18:03