Maratyszcza comments


Reproducing NNPACK numbers on SKL i5-6600K

@ngaloppo The timings are per batch. The network parameters are taken from the `cpu` branch of `convnet-benchmarks`. Please see #2 and Maratyszcza/NNPACK#9 for details. Backward pass is not supported and...

Reproducing NNPACK numbers on SKL i5-6600K

I changed the defaults in `caffe.proto` and recompiled Caffe for each algorithm.

Reproducing NNPACK numbers on SKL i5-6600K

@ngaloppo Do you use the prototxt from `convnet-benchmarks`? Specifications from other sources (e.g. the Caffe model zoo) may have different image sizes or numbers of channels in hidden layers.

Reproducing NNPACK numbers on SKL i5-6600K

@anijain2305 NNPACK uses `OMP_NUM_THREADS` if the variable is set, or all virtual threads if it is not.
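
The rule above can be sketched roughly as follows. This is an illustration only, not NNPACK's code: the function name `resolve_thread_count` is hypothetical, and falling back to all logical CPUs when the variable holds an invalid value is an assumption of this sketch.

```python
import os


def resolve_thread_count(env=None):
    """Return the thread count implied by OMP_NUM_THREADS, else all logical CPUs.

    `env` is a mapping used in place of os.environ (handy for testing).
    """
    env = os.environ if env is None else env
    value = env.get("OMP_NUM_THREADS")
    if value:
        try:
            count = int(value)
            if count > 0:
                return count
        except ValueError:
            pass  # assumption: a malformed value is treated as "not set"
    # Variable not set (or unusable): use every logical CPU the OS reports.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(resolve_thread_count())
```

To pin a benchmark to, say, four threads, one would export `OMP_NUM_THREADS=4` before launching it.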

Reproducing NNPACK numbers on SKL i5-6600K

@wangxi123 If you want to reproduce the results from the README, **don't** use the `--enable-psimd` option.

Reproducing NNPACK numbers on SKL i5-6600K

@wangxi123 When you add `engine: NNPACK`, Caffe uses the NNPACK implementation. If NNPACK is configured **with** `--enable-psimd`, that is a generic small-SIMD implementation using SSE2. If you configure NNPACK...

Reproducing NNPACK numbers on SKL i5-6600K

@wangxi123 The `WINOGRAD` algorithm is implemented only for 3x3 kernels. `AUTO` chooses an algorithm automatically among FFT, Winograd transform, and implicit GEMM.
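
As a rough sketch of the constraint above, a caller could check a requested algorithm against the kernel size before running a layer. The function `is_request_valid` and the labels `"FFT"` and `"IMPLICIT_GEMM"` are hypothetical names for illustration; only the 3x3 Winograd restriction stated above is modelled, and any limits on the other algorithms, as well as the heuristic `AUTO` uses, are not represented.

```python
# Hypothetical labels for the choices named in the comment above.
ALGORITHMS = ("AUTO", "FFT", "WINOGRAD", "IMPLICIT_GEMM")


def is_request_valid(algorithm, kernel_size):
    """Return True if `algorithm` can be requested for a kernel of `kernel_size`.

    kernel_size is a (height, width) pair.
    """
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    if algorithm == "WINOGRAD":
        # Per the comment above, Winograd is implemented only for 3x3 kernels.
        return tuple(kernel_size) == (3, 3)
    # Assumption of this sketch: no other constraints are modelled.
    return True


if __name__ == "__main__":
    for request in (("WINOGRAD", (3, 3)), ("WINOGRAD", (5, 5)), ("AUTO", (5, 5))):
        print(request, is_request_valid(*request))
```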

Reproducing NNPACK numbers on SKL i5-6600K

@wangxi123 In the current implementation of most convolution functions in NNPACK, you need a fairly large batch size to get a speedup (at least 128, preferably 256). Note that it doesn't affect...

Ubuntu 18.04 linux stuck on test 3

This test checks all `2**32` possible input values, so most likely you just need to wait longer.
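
To give a sense of why such a test is slow, here is a minimal sketch of an exhaustive check, assuming the inputs are 32-bit patterns interpreted as single-precision floats. The helpers `float_from_bits` and `check_range` are hypothetical and are not taken from the test in question; the sketch walks only a small slice of the input space, whereas the full run covers every one of the `2**32` patterns.

```python
import struct

TOTAL_INPUTS = 2**32  # 4294967296 distinct 32-bit patterns


def float_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def check_range(func, reference, start, stop):
    """Count inputs in [start, stop) where func and reference disagree."""
    mismatches = 0
    for bits in range(start, stop):
        x = float_from_bits(bits)
        got, expected = func(x), reference(x)
        both_nan = got != got and expected != expected
        if got != expected and not both_nan:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    # Check 1000 patterns starting at the bit pattern of 1.0.
    start = 0x3F800000
    print(check_range(abs, abs, start, start + 1000))
    print(f"a full run would visit {TOTAL_INPUTS} inputs")
```

Even at tens of millions of checks per second, a loop over all `2**32` values takes minutes, so a long pause on that test is expected rather than a hang.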

A prior art reference

Thank you for the link. We'll mention it in later revisions of the paper, but the operation is not quite the same as FPADDRE. I don't fully understand the "young...