Performance issue in utils.py (by P3)
Hello! I've found a performance issue in utils.py: `.batch(MODEL_PARAMS['batch_size'])` (line 72) should be called before `.map(parse_example_helper_csv, num_parallel_calls=8)` (line 46), which could make your program more efficient.
Here is the TensorFlow documentation supporting this.
Besides, you need to check whether the function `parse_example_helper_csv` passed to `.map(parse_example_helper_csv, num_parallel_calls=8)` is affected, so the changed code still works properly. For example, if `parse_example_helper_csv` expects input with shape `(x, y, z)` before the fix, it will receive input with shape `(batch_size, x, y, z)` after the fix.
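To make the shape point concrete, here is a minimal sketch of the reordering. The data and the doubling op are synthetic stand-ins for the real CSV parsing, and `BATCH_SIZE` stands in for `MODEL_PARAMS['batch_size']`:

```python
import tensorflow as tf

BATCH_SIZE = 4  # stand-in for MODEL_PARAMS['batch_size']

def parse_scalar(x):
    # per-element transform: x is a single scalar
    return x * 2

def parse_vectorized(x):
    # same transform, but x now has shape (batch_size,) because
    # .batch() runs first; elementwise tf ops broadcast, so the
    # body happens to be unchanged here (a real parser may not be)
    return x * 2

ds = tf.data.Dataset.range(8)

# before: map each element individually, then batch
slow = ds.map(parse_scalar, num_parallel_calls=tf.data.AUTOTUNE).batch(BATCH_SIZE)

# after: batch first, then map once per batch (less per-element overhead)
fast = ds.batch(BATCH_SIZE).map(parse_vectorized, num_parallel_calls=tf.data.AUTOTUNE)
```

Both pipelines yield identical batches; the second just invokes the map function `batch_size` times less often.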
Looking forward to your reply. By the way, I'd be glad to create a PR to fix this if you are too busy.
@DLPerf Thanks for pointing this out! Honestly I hadn't paid much attention to performance before >< I just took a look at that performance doc and found there are actually multiple ways to speed up `tf.data` ^O^ !
- speed up data transformation
  - sequential mapping -> parallel mapping, using `num_parallel_calls`, which I already use in the code
  - scalar mapping -> vectorized mapping, by calling `batch` before `map` as you proposed
- speed up data extraction: sequential extraction -> parallel extraction, using `interleave`. But to use this, I think we would need to chunk the training samples into multiple TFRecords in advance?
- parallelize the above ops with training, using `prefetch`.
Maybe we can add the other two as well? Currently I am indeed not available to maintain this repo, could you please help me fix this? Much appreciated!