scikit-learn_bench

Large number of memory allocations becomes a bottleneck

Open jkr0103 opened this issue 3 years ago • 4 comments

I captured perf data for most of the algorithms and see that a large number of memory allocations happen during the run, and they become the bottleneck. Please refer to the attached screenshot.

Is there a way to fine-tune the memory allocations, such as an environment variable or command-line argument?

[Screenshot: perf data for nusvc]

jkr0103 avatar Nov 09 '22 06:11 jkr0103

Memory allocations are expected, especially during the generation of synthetic datasets. I can't add anything else without knowing what exactly you are running in this screenshot. There is no variable or argument to control it.

Alexsandruss avatar Nov 10 '22 09:11 Alexsandruss

Is it possible to split the "generation of synthetic datasets" and the "actual benchmark execution" between two processes? In my case, I am trying to run these benchmark algorithms in SGX using Gramine, where we have memory constraints. Hence I would like to know whether the synthetic datasets can be generated separately, so that only the benchmark execution happens inside SGX.

jkr0103 avatar Nov 11 '22 05:11 jkr0103
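
The split asked about above can be sketched in general terms as follows. This is a minimal illustration using only NumPy, not scikit-learn_bench's own mechanism: the function names `generate_dataset` and `load_dataset`, the file names, and the toy labeling rule are all hypothetical. The idea is that one process writes the synthetic data to disk, and a second process (for example, the one running in the memory-constrained environment) only loads it. Whether memory-mapped loading actually reduces memory pressure inside Gramine/SGX is not verified here.

```python
import os
import tempfile

import numpy as np


def generate_dataset(path_x, path_y, n_samples=1000, n_features=20, seed=0):
    """Step 1 (hypothetical helper): run outside the constrained environment.

    Creates a synthetic classification dataset and stores it as .npy files.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    # Toy labeling rule, chosen only for illustration.
    y = (X[:, 0] + X[:, 1] > 0).astype(np.int64)
    np.save(path_x, X)
    np.save(path_y, y)


def load_dataset(path_x, path_y):
    """Step 2 (hypothetical helper): run inside the constrained environment.

    Memory-maps the stored arrays instead of regenerating them, so the
    generation-time allocations do not happen in this process.
    """
    X = np.load(path_x, mmap_mode="r")
    y = np.load(path_y, mmap_mode="r")
    return X, y


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    x_file = os.path.join(workdir, "X_train.npy")
    y_file = os.path.join(workdir, "y_train.npy")

    generate_dataset(x_file, y_file)      # would be process 1
    X, y = load_dataset(x_file, y_file)   # would be process 2
    print(X.shape, y.shape)
```

In a real setup the two steps would live in separate scripts, with only the second one executed inside the enclave.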

Sorry, I closed it by mistake.

jkr0103 avatar Nov 11 '22 05:11 jkr0103

This would be addressed by the pre-fetch capability in this PR: https://github.com/IntelPython/scikit-learn_bench/pull/133

napetrov avatar May 16 '23 11:05 napetrov