
How to set parameter "emb_save_dir" when using multiple PS servers?

Open pangedeshijie opened this issue 5 years ago • 2 comments

I have specified the same local directory on each PS server, but the result is not good. Should this parameter be set to a public HDFS directory?

Thanks!

pangedeshijie avatar Sep 03 '20 12:09 pangedeshijie

Yes, a distributed file system is needed. You can either launch the distributed program on a single local machine, or mount NFS to a local directory to run a truly distributed model.

Currently, HDFS is not supported.
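To illustrate why the same local path on each PS server is not enough, here is a minimal sketch in plain Python. It is not graph-learn code: `save_shard` and `load_all` are hypothetical helpers, the file naming is an assumption, and the separate machines are only simulated with separate temporary directories. It shows that shards written to per-machine disks are invisible to each other, while a shared directory (such as an NFS mount) exposes all of them.

```python
import os
import tempfile


def save_shard(emb_save_dir, task_index, rows):
    """Hypothetical helper: write one PS task's embedding rows to its own file."""
    os.makedirs(emb_save_dir, exist_ok=True)
    path = os.path.join(emb_save_dir, f"part-{task_index}.txt")
    with open(path, "w") as f:
        for node_id, vec in rows:
            f.write(f"{node_id}\t{','.join(str(x) for x in vec)}\n")
    return path


def load_all(emb_save_dir):
    """Hypothetical helper: read every shard visible under emb_save_dir."""
    result = {}
    for name in sorted(os.listdir(emb_save_dir)):
        with open(os.path.join(emb_save_dir, name)) as f:
            for line in f:
                node_id, vec = line.rstrip("\n").split("\t")
                result[int(node_id)] = [float(x) for x in vec.split(",")]
    return result


# Toy embeddings held by two PS tasks (made-up values for illustration).
shards = {0: [(0, [0.1, 0.2]), (2, [0.5, 0.6])], 1: [(1, [0.3, 0.4])]}

with tempfile.TemporaryDirectory() as root:
    # Case 1: each "machine" has its own disk, simulated by separate directories.
    local_dirs = {i: os.path.join(root, f"machine{i}", "emb") for i in shards}
    for i, rows in shards.items():
        save_shard(local_dirs[i], i, rows)
    seen_on_machine0 = load_all(local_dirs[0])

    # Case 2: every task writes to one shared directory (stand-in for an NFS mount).
    shared_dir = os.path.join(root, "shared", "emb")
    for i, rows in shards.items():
        save_shard(shared_dir, i, rows)
    seen_shared = load_all(shared_dir)

print(sorted(seen_on_machine0))  # [0, 2]
print(sorted(seen_shared))       # [0, 1, 2]
```

In the first case, reading from machine 0 only finds the rows that task 0 wrote; in the second, all rows are found in one place.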

jackonan avatar Sep 07 '20 06:09 jackonan

@jackonan Thank you very much for your reply. Since HDFS is currently not supported, how can I use multiple PS servers? My embeddings can't be saved with a PS.

pangedeshijie avatar Sep 07 '20 09:09 pangedeshijie

Actually, saving a model to HDFS is supported by TensorFlow.

baoleai avatar Oct 14 '22 08:10 baoleai