TensorLayer Performance issues in examples/

Hello! I've found a performance issue in tensorlayer/examples: batch() should be called before map(), so that the mapped function runs once per batch instead of once per element, which could make your program more efficient. The TensorFlow documentation on tf.data performance supports this.

Details are listed below:

examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_quanconv_cifar10.py: train_ds = train_ds.batch(batch_size) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_quanconv_cifar10.py: test_ds = test_ds.batch(batch_size) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()).
examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()).
examples/data_process/tutorial_fast_affine_transform.py: dataset = dataset.batch(batch_size) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count()).
examples/data_process/tutorial_tf_dataset_voc.py: ds = ds.batch(batch_size) should be called before ds = ds.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count()).
examples/basic_tutorials/tutorial_cifar10_cnn_static.py: train_ds = train_ds.batch(batch_size) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()).
examples/basic_tutorials/tutorial_cifar10_cnn_static.py: test_ds = test_ds.batch(batch_size) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()).
examples/deprecated_tutorials/tutorial_imagenet_inceptionV3_distributed.py: dataset = dataset.batch(batch_size) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=max_cpus).

Besides, you need to check whether the function passed to map() (e.g., _map_fn in dataset.map()) is affected by the reordering, so that the changed code still works properly. For example, if _map_fn takes data with shape (x, y, z) as its input before the fix, it will receive data with shape (batch_size, x, y, z) after the fix.
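To illustrate the shape change, here is a minimal sketch of both orderings. It is not taken from any of the listed examples: the zero-filled tensors, the 32x32x3 image size, and the names _map_fn_single and _map_fn_batched are assumptions made up for this illustration, and the mapped function is a simple normalization that happens to work on both shapes.

```python
import multiprocessing

import tensorflow as tf

# Hypothetical stand-in data: 8 CIFAR-10-sized images with labels.
images = tf.zeros([8, 32, 32, 3], dtype=tf.uint8)
labels = tf.zeros([8], dtype=tf.int64)
batch_size = 4


def _map_fn_single(image, label):
    # Before the fix: called once per example, image has shape (32, 32, 3).
    image = tf.cast(image, tf.float32) / 255.0
    return image, label


def _map_fn_batched(images, labels):
    # After the fix: called once per batch, images has shape (batch, 32, 32, 3).
    images = tf.cast(images, tf.float32) / 255.0
    return images, labels


# Current ordering in the examples: map() first, then batch().
ds_before = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(_map_fn_single, num_parallel_calls=multiprocessing.cpu_count())
    .batch(batch_size)
)

# Proposed ordering: batch() first, then map() over whole batches.
ds_after = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(batch_size)
    .map(_map_fn_batched, num_parallel_calls=multiprocessing.cpu_count())
)

first_before = next(iter(ds_before))
first_after = next(iter(ds_after))
print(first_before[0].shape, first_after[0].shape)
```

Both pipelines yield batches of the same shape here because the normalization is element-wise. A _map_fn that decodes a single record, indexes into a specific axis, or applies a random augmentation per image would likely need to be rewritten before the reordering is safe.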

Looking forward to your reply. By the way, I would be glad to create a PR to fix this if you are too busy.

Aug 20 '21 06:08 DLPerf

Thanks, we will check this ASAP.

Aug 20 '21 06:08 zsdonghao

Hello, how long will you need to confirm this problem? @zsdonghao Thank you~

Aug 31 '21 07:08 DLPerf

Sorry for the late reply! I will modify them and push an update. @DLPerf

Sep 08 '21 01:09 hanjr92