Performance issue in utils.py (by P3)
Hello! I've found a performance issue in utils.py: `.batch(MODEL_PARAMS['batch_size'])` (line 72) should be called before `.map(parse_example_helper_csv, num_parallel_calls=8)` (line 46), which could make your program more efficient.
Here is the TensorFlow documentation supporting this.
Besides, you need to check whether the function `parse_example_helper_csv` passed to `.map(parse_example_helper_csv, num_parallel_calls=8)` is affected, so the changed code still works properly. For example, if `parse_example_helper_csv` expects input with shape `(x, y, z)` before the fix, it will receive input with shape `(batch_size, x, y, z)` after the fix.
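To make the shape point concrete, here is a minimal sketch of the reordering. The data and the doubling op are synthetic stand-ins for the real CSV parsing, and `BATCH_SIZE` stands in for `MODEL_PARAMS['batch_size']`:

```python
import tensorflow as tf

BATCH_SIZE = 4  # stand-in for MODEL_PARAMS['batch_size']

def parse_scalar(x):
    # per-element transform: x is a single scalar
    return x * 2

def parse_vectorized(x):
    # same transform, but x now has shape (batch_size,) because
    # .batch() runs first; elementwise tf ops broadcast, so the
    # body happens to be unchanged here (a real parser may not be)
    return x * 2

ds = tf.data.Dataset.range(8)

# before: map each element individually, then batch
slow = ds.map(parse_scalar, num_parallel_calls=tf.data.AUTOTUNE).batch(BATCH_SIZE)

# after: batch first, then map once per batch (less per-element overhead)
fast = ds.batch(BATCH_SIZE).map(parse_vectorized, num_parallel_calls=tf.data.AUTOTUNE)
```

Both pipelines yield identical batches; the second just invokes the map function `batch_size` times less often.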
Looking forward to your reply. By the way, I'd be glad to create a PR to fix this if you are too busy.
@DLPerf Thanks for pointing this out! Honestly I hadn't paid much attention to performance before >< I just took a look at that performance doc and found there are actually multiple ways to speed up `tf.data` ^O^ !
- speed up data transformation
  - sequential mapping -> parallel mapping, using `num_parallel_calls`, which I already use in the code
  - scalar mapping -> vectorized mapping, by calling `batch` before `map` as you proposed
- speed up data extraction: sequential extraction -> parallel extraction, using `interleave`. But to use this, I think we would need to chunk the training samples into multiple TFRecords in advance?
- parallelize the above ops with training, using `prefetch`.
Maybe we can add the other two as well? Currently I am indeed not available to maintain this repo, could you please help me fix this? Much appreciated!