PerfZero without Docker fails with AccessDeniedException
raise Exception('"{}" failed with code:{} and stdout:\n{}'.format(
Exception: "['gsutil', '-m', 'cp', '-r', '-n', 'gs://tf-performance/auth_tokens/benchmark_upload_gce.json', '/home/hotaru/tensorflow-benchmarks/perfzero/workspace']" failed with code:1 and stdout:
AccessDeniedException: 403 [email protected] does not have storage.objects.list access to tf-performance.
CommandException: 1 file/object could not be transferred.
I also get this. Any way to fix?
Same here; it looks like an issue with the bucket configuration (IAM).
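If you want to confirm it is the bucket's IAM policy and not something local, listing the bucket with your own credentials (assuming you have the Cloud SDK installed) should reproduce the same 403:

gsutil ls gs://tf-performance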
I have moved on from the project, but I created the tool with another person about a year ago. Lower your expectations for an answer here, but I know what is likely happening.
The tool is trying to download an authentication token, which is only used by the TensorFlow testing/performance team; the tool then uses that token to upload results and access data.
I do not know if they have changed anything, but I put this in the guide to address the problem:
The key is to pass an empty arg for --gcloud_key_file_url, which tells the tool not to try to pull a token down. I hope this helps. You can also find that line of code and just remove it, but I am pretty sure the flag still works. I am kind of sad I moved on from the project; I really enjoyed it and looked forward to the day when people moved from tf_cnn_benchmarks (also a tool I was involved with) to this tool, which I very much enjoyed creating and growing. Good luck.
python3 /workspace/perfzero/lib/benchmark.py \
--git_repos="https://github.com/tensorflow/models.git" \
--python_path=models \
--gcloud_key_file_url="" \
--benchmark_methods=official.benchmark.keras_cifar_benchmark.Resnet56KerasBenchmarkSynth.benchmark_1_gpu_no_dist_strat
Keep in mind they might have moved the test, but the source file is linked below.
This command "should" work as-is because it does not use any data. Another problem you will run into is needing to stage the data where PerfZero (the test, actually) can find it. It is not too hard, but because I could not share the source data, e.g. ImageNet, it was something I had to kind of gloss over. It is not as hard as I am making it sound: /data/imagenet was common, I think, or /data/cifar10.
Here is the source file for all the CIFAR10 tests: https://github.com/tensorflow/models/blob/master/official/benchmark/keras_cifar_benchmark.py
At the top you see CIFAR_DATA_DIR_NAME = 'cifar-10-batches-bin'
That will get concatenated onto the --root_data_dir=$ROOT_DATA_DIR arg that you pass, so if you are not running a synthetic test, the data needs to be somewhere like /data/cifar-10-batches-bin. The README has some decent coverage of these args, but to be clear, I know it would have been almost impossible for you to connect the error you saw to needing to pass a blank arg. I just want you to know we spent a bit of time trying to document the args; that does not mean we were successful, but I/we tried. :-)
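To make that concrete, here is a sketch of staging CIFAR-10 and pointing the runner at it. I have not verified this recently; the download URL is the standard CIFAR-10 binary tarball from the dataset homepage, which extracts to a cifar-10-batches-bin directory, and you would swap in one of the non-synthetic benchmark methods from that source file:

mkdir -p /data
curl -O https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
tar -xzf cifar-10-binary.tar.gz -C /data    # creates /data/cifar-10-batches-bin

python3 /workspace/perfzero/lib/benchmark.py \
--git_repos="https://github.com/tensorflow/models.git" \
--python_path=models \
--gcloud_key_file_url="" \
--root_data_dir=/data \
--benchmark_methods=<one of the non-synthetic methods in keras_cifar_benchmark.py>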
These tests are really good because they run "normal" TensorFlow code, and there is a team maintaining them to ensure the models are 100% correct.