Benchmarks
Stash datafile numpy arrays and concatenate once
Avoid appending to xt_all and yt_all on every datagen frame; instead, stash each frame's xt and yt arrays in a Python list.
Concatenate all the xt and yt arrays once, after every datagen frame has been processed, so that the memory copy is triggered only once.
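The pattern can be sketched roughly as follows. This is a minimal illustration and not the actual patch: gen_frames, collect, and the array shapes are hypothetical stand-ins for the datagen loop, and only the names xt, yt, xt_all, and yt_all come from the description above.

```python
import numpy as np


def gen_frames(n_frames, rows=4, cols=3):
    """Hypothetical stand-in for datagen: yields one (xt, yt) pair per frame."""
    for i in range(n_frames):
        xt = np.full((rows, cols), i, dtype=np.float32)
        yt = np.full((rows,), i, dtype=np.float32)
        yield xt, yt


def collect(n_frames):
    # Stash per-frame arrays in plain Python lists; list.append only stores
    # a reference, so no array data is copied inside the loop.
    xt_list, yt_list = [], []
    for xt, yt in gen_frames(n_frames):
        xt_list.append(xt)
        yt_list.append(yt)

    # Concatenate once at the end: a single allocation and a single copy,
    # instead of reallocating and copying the growing array on every frame.
    xt_all = np.concatenate(xt_list, axis=0)
    yt_all = np.concatenate(yt_list, axis=0)
    return xt_all, yt_all


xt_all, yt_all = collect(5)
```

Growing an array with np.append or np.concatenate inside the loop copies everything accumulated so far on each frame, so the total copy volume grows roughly quadratically with the number of frames; the list-based version copies each element once.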
Before this patch, p2b1_baseline_keras2.py on Haswell (Cooley at Argonne - E5-2620v3 x2, 384 GB RAM, K80 GPU) runs in 4590 seconds.
After this patch, it runs in 3555 seconds, a ~23% reduction in runtime (about a 1.29x speedup).
In situations with limited memory bandwidth (such as when using Optane DC Memory, or external memory via the RAN project at Argonne), this change would likely have a significantly higher impact.