benchmarks
Fix a bug in the eval function when variable_update=parameter_server|distributed_replicated
First, the _eval function currently doesn't support 'variable_update=parameter_server' or 'variable_update=distributed_replicated', and errors occur when the 'replicated' mode is used to restore parameters from a checkpoint created by training with 'variable_update=parameter_server|distributed_replicated'. I changed the 'target' to fix this.
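As a minimal sketch of the 'target' fix described above (the function name and server argument are illustrative, not the actual benchmark code): under the distributed modes, eval should connect to the in-process server's target instead of opening a plain local session.

```python
# Hypothetical helper: choose the tf.Session target for eval based on the
# variable_update mode. Names here are illustrative assumptions.
def eval_session_target(variable_update, server=None):
    if variable_update in ('parameter_server', 'distributed_replicated'):
        # Connect to the in-process tf.train.Server so eval can see the
        # variables placed by the distributed training job.
        return server.target if server is not None else ''
    # 'replicated' and single-machine modes use a local session.
    return ''
```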
Second, with variable_update='distributed_replicated', the eval results looked incorrect. I found that tf.global_variables() contained no parameters when restoring the checkpoint, and even during training tf.global_variables() contained only 190+ parameters (copies of the local variables, trainable variables only), without 'batchnorm/gamma', 'batchnorm/moving_mean', and 'batchnorm/moving_variance'. So I changed the code to save/restore parameters from/to tf.local_variables(), and it worked.
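The collection mismatch above can be reproduced with a tiny TF1-style sketch (a hypothetical illustration, assuming `tf.compat.v1`; the scope name mimics the per-tower scopes in tf_cnn_benchmarks): a variable created only in the LOCAL_VARIABLES collection is invisible to a Saver built over the default (global) collection, so the Saver has to be built over tf.local_variables() instead.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Under distributed_replicated, per-tower variables (including batchnorm
# statistics) live in the LOCAL_VARIABLES collection, not GLOBAL_VARIABLES.
with tf.variable_scope('v0'):  # tower-style scope, illustrative
    w = tf.get_variable('w', shape=[2],
                        collections=[tf.GraphKeys.LOCAL_VARIABLES])

print(len(tf.global_variables()))  # 0: w is not in the global collection
print(len(tf.local_variables()))   # 1

# A default Saver (over global variables) would miss w entirely; building
# it over tf.local_variables() makes it save/restore the real parameters.
saver = tf.train.Saver(var_list=tf.local_variables())
```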
@reedwm Can you take a look? I think you are dealing with this internally. I will merge internal to external to get a better version out here and would like to clean up these PRs first. Thank you.