
Implement OA `retrieve(_outer)` and its `multiset` API

Open sleeepyjack opened this issue 1 year ago • 2 comments

WIP

Closes #465 Closes #489

sleeepyjack avatar Jul 11 '24 00:07 sleeepyjack

The outer test is still failing, and the slowdown compared to the previous implementation is still 1.5x. Apart from that, the other unit tests look good. So the natural next steps would be to fix the bug in `retrieve_outer` (shouldn't be a big deal) and dive into optimizations. For the latter I could use a second pair of eyes since this kernel is notoriously complex.

sleeepyjack avatar Jul 19 '24 02:07 sleeepyjack

> For the latter I could use a second pair of eyes since this kernel is notoriously complex.

Commenting out the code part by part to find the largest bottleneck is probably the most efficient way.

PointKernel avatar Jul 19 '24 17:07 PointKernel