Wojciech Uss
1. The current API does not allow getting the size of a remote pool without opening it. Therefore, we cannot verify the size of a damaged remote pool without recreating...
A damaged part can also be considered neither a poolset file nor a pool file, yet we still need a means of removing the part, particularly in the...
We are currently working with the following two models: `ernie_quant.zip` and `fp32_model.tar.gz`. Results gathered with the optimized `fp32_model` will be labeled `FP32`. Results gathered with the fake quantized FP32...
I have just noticed the above results were mixed. They are updated now.
FP32 results with @GaoWei8's fix (https://github.com/PaddlePaddle/Paddle/pull/20972): `Run 5010 samples, average latency: 32.0582 ms per sample.`
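For comparing runs, a summary line like the one above can be converted into throughput and total run time. The snippet below is only a minimal sketch: the `parse_latency` helper is hypothetical (not part of the benchmark scripts), and it assumes the log line always has exactly the format quoted above.

```python
import re

# Summary line copied from the FP32 run quoted above.
LOG_LINE = "Run 5010 samples, average latency: 32.0582 ms per sample."


def parse_latency(line):
    """Hypothetical helper: return (sample_count, avg_latency_ms) from a summary line."""
    match = re.search(
        r"Run (\d+) samples, average latency: ([\d.]+) ms per sample", line
    )
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    return int(match.group(1)), float(match.group(2))


samples, latency_ms = parse_latency(LOG_LINE)
throughput = 1000.0 / latency_ms            # samples per second
total_s = samples * latency_ms / 1000.0     # time spent on all samples, in seconds

print(f"{samples} samples, {latency_ms} ms/sample")
print(f"throughput: {throughput:.2f} samples/s, total: {total_s:.2f} s")
```

For this line the throughput comes out at roughly 31.19 samples per second, with about 160.6 s spent on the 5010 samples.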
Below are our latest performance results for Ernie FP32 and INT8 runs. The tests were run on CLX 6248 with the following affinity settings:
```
export KMP_AFFINITY=granularity=fine,compact,1,0
export KMP_BLOCKTIME=1
```
INT8 tests...
I have updated the results in https://github.com/PaddlePaddle/benchmark/issues/275#issuecomment-565858479 above. Results with additionally quantized `reshape2` and `transpose2` are added.
@luotao1, today we found that since our last benchmarks there have been some changes in the code of the FC INT8 3D support PR (PaddlePaddle/Paddle#21746) which influenced the performance. There is...
@luotao1, the PR with FC INT8 3D support has been fixed and updated.
The latest FP32 results for the current clean `develop` branch (`25e765a4fe`) on SKX 6148:
* 4-dimensional input (`fp32_model`, `test_ds`): 1 thread: 189.39 ms, 20 threads: 30.20 ms.
* 2-dimensional input...
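
To put the 4-dimensional input figures in perspective, the multi-threaded speedup and parallel efficiency can be derived from the two latencies. This is only a rough sketch; the `scaling` helper is a hypothetical name introduced for illustration, and it assumes both latencies are per-sample averages measured on the same dataset.

```python
def scaling(t1_ms, tn_ms, n_threads):
    """Hypothetical helper: return (speedup, parallel efficiency) for an n-thread run."""
    speedup = t1_ms / tn_ms             # how many times faster than 1 thread
    efficiency = speedup / n_threads    # fraction of ideal linear scaling
    return speedup, efficiency


# Latencies for the 4-dimensional input quoted above (1 thread vs. 20 threads).
speedup, efficiency = scaling(189.39, 30.20, 20)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.1%}")
```

For these numbers the speedup is about 6.27x on 20 threads, i.e. roughly 31% of ideal linear scaling.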