[BUG] AlexNet, ResNet scripts of TensorFlow use wrong data path

Open shishaochen opened this issue 7 years ago • 4 comments

According to tools/tensorflowbm6_gpu21.config and http://dlbench.comp.hkbu.edu.hk/, version 8 of dlbench runs alexnet and resnet for the CNN tests.
However, the Cifar10 experiments fail because the data path in the scripts is wrong.

# From tools/tensorflow/cnn/alexnet/alexnet_cifar10.py and tools/tensorflow/cnn/resnet/resnet_cifar10.py
tf.app.flags.DEFINE_string('data_dir', os.environ['HOME']+'/data/tensorflow/cifar10/cifar-10-batches-bin', """Data directory""")

I manually downloaded the data file tensorflow.zip and ran a test. The correct path should be: ~/data/tensorflow/cifar10/cifar-10-batches-py. By the way, it would be appreciated if you could set the data path dynamically instead of hard-coding the user's home directory.
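As an illustration of what setting the data path dynamically could look like, here is a minimal sketch using only the Python standard library. The environment variable name DLBENCH_DATA_DIR and the helper resolve_cifar10_dir are hypothetical and are not part of dlbench; the fallback mirrors the $HOME/data layout described in this issue.

```python
import os


def resolve_cifar10_dir(env_var="DLBENCH_DATA_DIR",
                        subdir="tensorflow/cifar10/cifar-10-batches-py"):
    """Return the CIFAR-10 directory, preferring an environment override.

    DLBENCH_DATA_DIR is a hypothetical variable name used for illustration;
    when it is unset or empty, fall back to <home>/data as in this issue.
    """
    root = os.environ.get(env_var) or os.path.join(os.path.expanduser("~"), "data")
    return os.path.join(root, subdir)


if __name__ == "__main__":
    print(resolve_cifar10_dir())
```

The returned value could then be passed as the default of the existing data_dir flag, so users who keep their data elsewhere only need to export one variable.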

shishaochen avatar Sep 19 '17 08:09 shishaochen

Thanks for your suggestion. We assume that users put their data under $HOME/data/.

shyhuai avatar Sep 19 '17 17:09 shyhuai

@shyhuai I mean that the directory name should be "cifar-10-batches-py" rather than "cifar-10-batches-bin" at https://github.com/hclhkbu/dlbench/blob/master/tools/mxnet/train_cifar10.py#L32.
Besides, "$HOME" is equivalent to "~" in the Linux shell and in Python, so the path can also be written as "$HOME/data/tensorflow/cifar10/cifar-10-batches-py".
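One caveat when writing the path with "~" or "$HOME" inside a Python string: Python does not expand either form automatically, so the string has to go through os.path.expanduser or os.path.expandvars first. A small sketch, where expand_data_path is a hypothetical helper name:

```python
import os


def expand_data_path(path):
    """Expand both "$HOME"-style variables and a leading "~" in a path string."""
    return os.path.expanduser(os.path.expandvars(path))


if __name__ == "__main__":
    # Both spellings should resolve to the same directory on Linux.
    print(expand_data_path("~/data/tensorflow/cifar10/cifar-10-batches-py"))
    print(expand_data_path("$HOME/data/tensorflow/cifar10/cifar-10-batches-py"))
```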

shishaochen avatar Sep 19 '17 22:09 shishaochen

@shishaochen cifar-10-batches-py is used by the new input method, which was added for faster data reading, while cifar-10-batches-bin is still needed when --dataset is set to False. We will unify the two soon.
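Until the two are unified, one way to avoid the mismatch would be to derive the directory name from the input method and fail early with a clear message. This is only a sketch under the assumption that the dataset option behaves as described above; cifar10_data_dir is a hypothetical helper, not a function in dlbench.

```python
import os


def cifar10_data_dir(root, use_dataset):
    """Pick the CIFAR-10 directory that matches the chosen input method.

    Assumption taken from this thread: the new input method reads
    cifar-10-batches-py, the old one reads cifar-10-batches-bin.
    """
    name = "cifar-10-batches-py" if use_dataset else "cifar-10-batches-bin"
    path = os.path.join(root, name)
    if not os.path.isdir(path):
        # Report the exact missing directory instead of failing deep in training.
        raise FileNotFoundError("CIFAR-10 data directory not found: " + path)
    return path
```

Called with root set to the cifar10 directory, this would surface a missing cifar-10-batches-bin or cifar-10-batches-py folder before the network script starts.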

shyhuai avatar Sep 20 '17 05:09 shyhuai

@shyhuai The error occurs when we call the benchmark.py you provide. We would be grateful if you could make the "benchmark.py -> xxxbm.py -> t.sh -> network.py" pipeline work correctly and smoothly.

shishaochen avatar Sep 20 '17 05:09 shishaochen