Indexing performance

Open npolina4 opened this issue 2 years ago • 2 comments

import dpctl.tensor as dpt
a = dpt.ones((8192, 8192), device='cpu', dtype='f4')
b = dpt.ones((8192, 8192), device='cpu', dtype=bool)
%timeit a[b]
#211 ms ± 6.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

import numpy
a_np = numpy.ones((8192, 8192), dtype='f4')
b_np = numpy.ones((8192, 8192), dtype=bool)
%timeit a_np[b_np]
#87.1 ms ± 2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

npolina4, Jun 14 '23 23:06

This should be improved by the changes in gh-1300. @npolina4, could you please post timeit results from the same machine you used to obtain the numbers reported in the original comment?

oleksandr-pavlyk, Jul 27 '23 12:07

Results with the changes in https://github.com/IntelPython/dpctl/pull/1300:

Size: 8192, 8192
numpy: 105 ms
cpu: 205 ms
gpu: 115 ms

Size: 4096, 4096
numpy: 24.5 ms
cpu: 45~80 ms
gpu: 21.4 ms

npolina4, Jul 28 '23 17:07
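
For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch that times the same boolean-mask indexing with the standard `timeit` module and reports the spread across runs, which may help with noisy figures such as the 45~80 ms CPU range above. The helper names (`bench_mask_indexing`, `numpy_ones`, `dpctl_cpu_ones`) are made up for this illustration and are not part of dpctl or NumPy. The sketch also assumes that `a[b]` returns only after the result is ready; if the library queues work asynchronously, an explicit synchronization would be needed before stopping the timer.

```python
import statistics
import timeit

import numpy as np

try:
    import dpctl.tensor as dpt  # optional; only needed for the dpctl timings
except ImportError:
    dpt = None


def numpy_ones(shape, dtype):
    """Array factory for the NumPy case."""
    return np.ones(shape, dtype=dtype)


def dpctl_cpu_ones(shape, dtype):
    """Array factory for dpctl on the 'cpu' device, as in the original post."""
    return dpt.ones(shape, device="cpu", dtype=dtype)


def bench_mask_indexing(make_array, shape, repeat=7, number=1):
    """Time a[b] where b is an all-True boolean mask of the same shape.

    Returns (min, mean, stdev) in seconds per call.
    Assumption: a[b] completes before returning, so no extra sync is done.
    """
    a = make_array(shape, "f4")
    b = make_array(shape, bool)
    totals = timeit.repeat(lambda: a[b], repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    spread = statistics.stdev(per_call) if len(per_call) > 1 else 0.0
    return min(per_call), statistics.mean(per_call), spread


if __name__ == "__main__":
    shape = (4096, 4096)  # one of the sizes measured in this thread
    backends = {"numpy": numpy_ones}
    if dpt is not None:
        backends["dpctl cpu"] = dpctl_cpu_ones
    for name, factory in backends.items():
        lo, mean, sd = bench_mask_indexing(factory, shape)
        print(f"{name}: min {lo * 1e3:.1f} ms, "
              f"mean {mean * 1e3:.1f} ms ± {sd * 1e3:.1f} ms")
```

Passing `(8192, 8192)` as `shape` covers the larger case from the original comment. Reporting the minimum alongside the mean is a deliberate choice here: the minimum tends to be less sensitive to interference from other processes on the machine.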