Jongsoo Park
Please set conv_mode to DIRECT_SCONV (example: https://github.com/IntelLabs/SkimCaffe/blob/27df6a8796a012da722c3e2673739350133c1779/models/bvlc_googlenet/test_direct_sconv.prototxt#L144)
BTW, can you share how you fixed the link error for undefined symbols in protobuf? protobuf 3.9.0 didn't work. Which version worked for you?
Also, 50% sparsity is not high enough to give a noticeable speedup. I'd first try a higher sparsity, such as 90%.
Yes. Please see https://github.com/IntelLabs/SkimCaffe/blob/intel_scnn/src/caffe/util/math_functions_intel.cpp for which shapes are optimized. BTW, please first test performance with just a single thread, because thread scalability may not be optimized for all settings.
BTW, lower sparsity means more non-zeros, so a performance drop at lower sparsity is expected.
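To illustrate why more non-zeros cost more time, here is a minimal sketch of a plain CSR sparse matrix-vector product. It is not SkimCaffe's direct sparse convolution kernel; the function name `spmv_csr` and the returned multiply-add counter are hypothetical, added only to show that the inner loop runs once per stored non-zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: computes y = A * x for a matrix A stored in CSR form.
// Returns the number of multiply-adds performed, which equals the number of
// stored non-zeros, so the work grows as sparsity drops.
std::size_t spmv_csr(const std::vector<int>& rowptr,
                     const std::vector<int>& colidx,
                     const std::vector<float>& values,
                     const std::vector<float>& x,
                     std::vector<float>& y) {
  if (rowptr.empty()) {  // no rows at all
    y.clear();
    return 0;
  }
  const std::size_t m = rowptr.size() - 1;  // number of rows
  y.assign(m, 0.0f);
  std::size_t multiply_adds = 0;
  for (std::size_t i = 0; i < m; ++i) {
    float sum = 0.0f;
    // Only the non-zeros of row i are visited.
    for (int k = rowptr[i]; k < rowptr[i + 1]; ++k) {
      sum += values[k] * x[colidx[k]];
      ++multiply_adds;
    }
    y[i] = sum;
  }
  return multiply_adds;
}
```

In this simplified model, a matrix at 50% sparsity still does half of the dense multiply-adds, while one at 90% sparsity does a tenth, before counting the indexing overhead that the sparse format adds.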
@amylittleyang can you close this PR since it's abandoned in Phabricator?
Hi Riccardo, yes, in fact CSR is the default (and actually the only) matrix format currently implemented in SpMP. I have a SELLPACK implementation in my private repository that has better...
Hi Riccardo, my bad, I misunderstood your question. You can construct an instance of SpMP's CSR class from the three arrays, like `CSR A(m, n, rowptr, colidx, values); //...
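For readers unfamiliar with the three arrays mentioned above, the following is a minimal sketch of how `rowptr`, `colidx`, and `values` can be built from a small dense matrix. It does not use SpMP; the `CsrArrays` struct and `dense_to_csr` function are hypothetical, and the `double` value type and zero-based indexing are assumptions.

```cpp
#include <vector>

// Hypothetical container for the three CSR arrays plus the matrix dimensions.
struct CsrArrays {
  int m = 0;                   // number of rows
  int n = 0;                   // number of columns
  std::vector<int> rowptr;     // size m + 1; row i spans [rowptr[i], rowptr[i + 1])
  std::vector<int> colidx;     // column index of each stored non-zero
  std::vector<double> values;  // value of each stored non-zero
};

// Hypothetical helper: converts a row-major dense matrix to zero-based CSR.
CsrArrays dense_to_csr(const std::vector<std::vector<double>>& dense) {
  CsrArrays a;
  a.m = static_cast<int>(dense.size());
  a.n = a.m > 0 ? static_cast<int>(dense[0].size()) : 0;
  a.rowptr.push_back(0);
  for (const auto& row : dense) {
    for (int j = 0; j < static_cast<int>(row.size()); ++j) {
      if (row[j] != 0.0) {  // keep only the non-zeros
        a.colidx.push_back(j);
        a.values.push_back(row[j]);
      }
    }
    // Running count of non-zeros marks where the next row starts.
    a.rowptr.push_back(static_cast<int>(a.colidx.size()));
  }
  return a;
}
```

For the 3x3 matrix with rows `{1, 0, 2}`, `{0, 0, 0}`, `{0, 3, 0}`, this gives `rowptr = {0, 2, 2, 3}`, `colidx = {0, 2, 1}`, and `values = {1, 2, 3}`. Whether SpMP's constructor expects exactly these element types is not confirmed here.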
Can you tell me the specifications of the machine and the compiler you used? Did you specify thread affinity as in the comment of the tests? Single-thread performance should be more...
I'd start with single-thread runs with OMP_NUM_THREADS=1, and please also specify thread affinity to make the results more consistent. Then you can increase the number of threads, but I'd stay...