anpark issues


Results: 3 issues by anpark

Fleet ps on paddlecloud coredump when stop

Env: Paddle 1.5.1 with Fleet, 1 ps, 2 trainers, using dataset. The ps and worker0 stop successfully, but worker1 core dumps: trainer failed, exit_code=134, pure virtual method called, terminate called without an...

Make sure only chief worker can add global_step

https://github.com/tensorflow/benchmarks/blob/2389369f6b5c9d3241676a728b450e47482966c0/scripts/tf_cnn_benchmarks/benchmark_cnn.py#L1579 Why not change the global_step update op for #199? Make sure only the chief worker can increment global_step, like tf.train.SyncReplicasOptimizer. @alsrgv @reedwm

How to split dataset by multi workers?

Hi, if I have an input dataset in 5 parts in HDFS and I use 5 workers to train for 2 epochs, I think worker 0 reads part-0 for 2 epochs, worker 1 reads...
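
The split the question describes (each worker reading its own part for every epoch) can be illustrated with a minimal, framework-independent sketch. It is not the API of any particular library: the helper name `shard_files` and the round-robin assignment are assumptions made for illustration, and the file names simply mirror the `part-0` naming in the question.

```python
from typing import List


def shard_files(files: List[str], num_workers: int, worker_index: int) -> List[str]:
    """Hypothetical helper: return the files assigned to one worker.

    Files are sorted first so every worker sees the same order, then
    assigned round-robin: worker k takes files k, k + num_workers, ...
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if not 0 <= worker_index < num_workers:
        raise ValueError("worker_index must be in [0, num_workers)")
    return sorted(files)[worker_index::num_workers]


# Illustrative input: 5 parts and 5 workers, as in the question.
parts = [f"part-{i}" for i in range(5)]
num_workers = 5
num_epochs = 2

for worker in range(num_workers):
    my_files = shard_files(parts, num_workers, worker)
    for epoch in range(num_epochs):
        # Each worker re-reads only its own shard in every epoch.
        print(f"worker {worker}, epoch {epoch}: {my_files}")
```

With fewer workers than parts, the same helper gives each worker several files; with more workers than parts, some workers receive an empty list, which a real job would have to handle.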