TensorLayer icon indicating copy to clipboard operation
TensorLayer copied to clipboard

Performance issues in examples/

Open DLPerf opened this issue 4 years ago • 3 comments

Hello! I've found a performance issue in tensorlayer/examples: batch() should be called before map(), which could make your program more efficient. Here is the tensorflow document to support it.

Detailed description is listed below:

  • examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size)(here) shoule be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_quanconv_cifar10.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_quanconv_cifar10.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/data_process/tutorial_fast_affine_transform.py: dataset = dataset.batch(batch_size)(here) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/data_process/tutorial_tf_dataset_voc.py: ds = ds.batch(batch_size)(here) should be called before ds = ds.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/basic_tutorials/tutorial_cifar10_cnn_static.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/basic_tutorials/tutorial_cifar10_cnn_static.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
  • examples/deprecated_tutorials/tutorial_imagenet_inceptionV3_distributed.py: dataset = dataset.batch(batch_size)(here) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=max_cpus)(here).

Besides, you need to check the function called in map()(e.g., _map_fn called in dataset.map()) whether to be affected or not to make the changed code work properly. For example, if _map_fn needs data with shape (x, y, z) as its input before fix, it would require data with shape (batch_size, x, y, z).

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

DLPerf avatar Aug 20 '21 06:08 DLPerf

thanks, we will have a check asap

zsdonghao avatar Aug 20 '21 06:08 zsdonghao

Hello, How long do you need to confirm this problem? @zsdonghao Thank you~

DLPerf avatar Aug 31 '21 07:08 DLPerf

Sorry! It is too late to reply you. I will modify them and update. @DLPerf

hanjr92 avatar Sep 08 '21 01:09 hanjr92