LearningTensorFlow

Performance issues in /Project (by P3)

Open • DLPerf opened this issue 4 years ago • 1 comment

Hello! I've found a performance issue in /Project: batch() should be called before map(). With that order the mapped function is invoked once per batch instead of once per element, which could make your program more efficient. The TensorFlow documentation supports this.

A detailed description is listed below:

  • /NeuralMachineTranslation/model/test.py: parsed_dataset.batch(1) should be called before dataset.map(map_func=_parse_data).
  • /LanguageModel/dataset/ptb_process.py: parsed_dataset.batch(2) should be called before dataset.map(map_func=_parse_data).
  • /LanguageModel/model/test.py: parsed_dataset.batch(parameter.BATCH_SIZE) should be called before dataset.map(map_func=_parse_data).
  • /LanguageModel/model/train.py: .batch(parameter.BATCH_SIZE) should be called before dataset.map(map_func=_parse_data).
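To illustrate the reordering, here is a minimal sketch, assuming TensorFlow 2.x with eager execution. The function `scale` and the `Dataset.range` source are hypothetical stand-ins, not code from this repository; `scale` is elementwise, so it accepts a single element or a whole batch unchanged.

```python
import tensorflow as tf


def scale(x):
    # Hypothetical stand-in for _parse_data: an elementwise op that works
    # on a scalar and on a batched tensor alike.
    return tf.cast(x, tf.float32) * 0.5


source = tf.data.Dataset.range(8)

# Current order: scale runs once per element, then results are batched.
map_then_batch = source.map(scale).batch(4)

# Proposed order: elements are batched first, so scale runs once per batch.
batch_then_map = source.batch(4).map(scale)

results_before = [b.numpy().tolist() for b in map_then_batch]
results_after = [b.numpy().tolist() for b in batch_then_map]
```

Both pipelines yield the same batches here; the difference is only how many times the mapped function is invoked.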

Besides, you need to check whether the function called in map() (e.g., _parse_data in dataset.map(map_func=_parse_data)) is affected by the change, so that the modified code still works properly. For example, if _parse_data takes input of shape (x, y, z) before the fix, it will receive input of shape (batch_size, x, y, z) after it.
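As a sketch of what adapting the mapped function might look like, assuming TensorFlow 2.x and assuming _parse_data parses serialized tf.train.Example records (I have not verified this against the repository): a per-element parser built on `tf.io.parse_single_example` can be swapped for `tf.io.parse_example`, which accepts a batch of serialized records. The feature spec `FEATURES`, the helper `make_record`, and the sample data are hypothetical.

```python
import tensorflow as tf

# Hypothetical feature spec; the real one lives in the project's _parse_data.
FEATURES = {"ids": tf.io.FixedLenFeature([3], tf.int64)}


def make_record(ids):
    # Build one serialized tf.train.Example for illustration.
    feature = tf.train.Feature(int64_list=tf.train.Int64List(value=ids))
    example = tf.train.Example(
        features=tf.train.Features(feature={"ids": feature})
    )
    return example.SerializeToString()


records = [make_record([1, 2, 3]), make_record([4, 5, 6])]
dataset = tf.data.Dataset.from_tensor_slices(records)


def parse_single(serialized):
    # Input: one scalar string. Output shape: (3,).
    return tf.io.parse_single_example(serialized, FEATURES)["ids"]


def parse_batch(serialized_batch):
    # Input: string tensor of shape (batch_size,). Output: (batch_size, 3).
    return tf.io.parse_example(serialized_batch, FEATURES)["ids"]


per_element = next(iter(dataset.map(parse_single).batch(2)))
per_batch = next(iter(dataset.batch(2).map(parse_batch)))
```

If the mapped function contains per-example logic that cannot be vectorized, the reordering may not apply as-is to that function.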

Looking forward to your reply. By the way, I would be glad to create a PR to fix this if you are too busy.

DLPerf, Aug 29 '21 13:08

Hello, I'm looking forward to your reply~

DLPerf, Nov 04 '21 09:11