Shane Peelar

Results 120 comments of Shane Peelar

Wow, so here's a somewhat unexpected result:

![image](https://user-images.githubusercontent.com/7820684/184270637-c4a77fe7-c60a-44cf-a3a6-15f2f59359ac.png)

I implemented a rough AoSoA representation for comparison (improvements noted and needed -- currently relying on `struct` laying out Vec4s sensibly). I...
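For readers unfamiliar with the term, here is a minimal sketch of what an AoSoA (array of structures of arrays) layout for `Vec3` could look like. It is not the prototype from this thread: `Vec3`, `Vec3x4`, `LANES`, `to_aosoa`, `get`, and `translate_x` are all hypothetical names chosen for illustration, and a 4-lane width is assumed.

```rust
// Hypothetical types for illustration only; not Bevy's actual definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Assumed lane count (4 x f32 = 16 bytes).
pub const LANES: usize = 4;

/// One AoSoA chunk: four x values, then four y values, then four z values.
/// `repr(C)` pins the field order; `align(16)` makes each lane group
/// start on a 16-byte boundary.
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, Default)]
pub struct Vec3x4 {
    pub x: [f32; LANES],
    pub y: [f32; LANES],
    pub z: [f32; LANES],
}

/// Convert a plain AoS slice into AoSoA chunks, zero-padding the last chunk.
pub fn to_aosoa(aos: &[Vec3]) -> Vec<Vec3x4> {
    let mut out = vec![Vec3x4::default(); (aos.len() + LANES - 1) / LANES];
    for (i, v) in aos.iter().enumerate() {
        let (chunk, lane) = (i / LANES, i % LANES);
        out[chunk].x[lane] = v.x;
        out[chunk].y[lane] = v.y;
        out[chunk].z[lane] = v.z;
    }
    out
}

/// Random access to element `i`: all three fields live in the same chunk.
pub fn get(chunks: &[Vec3x4], i: usize) -> Vec3 {
    let c = &chunks[i / LANES];
    let lane = i % LANES;
    Vec3 { x: c.x[lane], y: c.y[lane], z: c.z[lane] }
}

/// Lane-wise update over whole chunks; padding lanes are updated too,
/// which is harmless because they are never read back through `get`.
pub fn translate_x(chunks: &mut [Vec3x4], dx: f32) {
    for c in chunks.iter_mut() {
        for lane in 0..LANES {
            c.x[lane] += dx;
        }
    }
}
```

In this sketch, a random access to one element stays within a single 48-byte chunk, while batch operations such as `translate_x` still walk contiguous groups of four values per field.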

> For example, I'm thinking about cases where random access happens, which would require 3x the amount of memory fetches for Vec3s if I understand the data layout correctly?

I...

Ah, I hadn't even considered hierarchical relations! Thanks for that :) With that in mind, it may make sense to go for an AoSoA approach in the long run. You...

**ASIDE:** I was curious why the speedup for the non-AoS layouts wasn't as much as I expected, so I fired up vTune. With 4-wide SIMD, accounting for overhead, I...

Okay so, not all roses as I'd hoped. It looks like more than just the Query types will need to be altered; the Tables themselves might need some notion of...

About midway through the prototype implementation. There's a lot to be improved for sure, but it's coming along. I wanted to note here that the change ticks are a little...

Unfortunately, unaligned loads are not portable across architectures. Even where they work, they can carry a performance cost in some cases, especially when the hardware doesn't support them natively. Otherwise I agree that...
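As a rough illustration of one portable way to avoid unaligned loads, the sketch below uses the standard library's `slice::align_to` to split a slice into an unaligned head, a 16-byte-aligned body, and a tail. This is an assumption about how one might structure such code, not what the prototype does; `F32x4` and `sum_aligned` are hypothetical names.

```rust
/// Hypothetical 16-byte-aligned group of four f32 values.
#[repr(C, align(16))]
#[derive(Clone, Copy)]
pub struct F32x4(pub [f32; 4]);

/// Sum a slice by processing the aligned middle in 4-wide groups and
/// the unaligned head and tail one element at a time.
pub fn sum_aligned(data: &[f32]) -> f32 {
    // SAFETY: `F32x4` is a plain array of four `f32`, so every bit pattern
    // that is valid for four consecutive `f32` is valid for `F32x4`.
    // `align_to` only places correctly aligned elements in the middle slice.
    let (head, body, tail) = unsafe { data.align_to::<F32x4>() };

    // Per-lane accumulators; each `v` is read from a 16-byte-aligned address.
    let mut acc = [0.0f32; 4];
    for v in body {
        for lane in 0..4 {
            acc[lane] += v.0[lane];
        }
    }

    // Leftover elements that did not fit an aligned group.
    let scalar: f32 = head.iter().chain(tail).sum();
    acc.iter().sum::<f32>() + scalar
}
```

The function returns the same total whether or not the input slice starts on a 16-byte boundary; only the split between the scalar and grouped paths changes. The cost is the extra head/tail handling, which is part of why guaranteeing alignment up front is attractive.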

Made a good amount of progress. There are a few more changes than I originally anticipated, but so far reads are working and Miri isn't complaining. My working batch type definition...

Another update. Due to the `std::simd` situation and no clear way to query the platform preferred vector width, I've opted to have tables be aligned to (minimum) 16 bytes for...
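To make the "aligned to (minimum) 16 bytes" idea concrete, here is a minimal sketch of allocating a buffer with a 16-byte minimum alignment through `std::alloc`. It is only an illustration under stated assumptions: `AlignedColumn` and `MIN_ALIGN` are hypothetical names, and the real table storage in the prototype is likely structured differently.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Assumed minimum alignment, taken from the 16-byte figure in the comment above.
pub const MIN_ALIGN: usize = 16;

/// Hypothetical owner of one zero-initialised, 16-byte-aligned byte buffer.
pub struct AlignedColumn {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl AlignedColumn {
    /// Allocate `size_bytes` bytes aligned to at least `MIN_ALIGN`.
    pub fn new(size_bytes: usize) -> Self {
        // Zero-sized allocations are not allowed by the global allocator API.
        assert!(size_bytes > 0, "size must be non-zero");
        let layout = Layout::from_size_align(size_bytes, MIN_ALIGN)
            .expect("invalid size/alignment combination");
        // SAFETY: `layout` has a non-zero size, as checked above.
        let raw = unsafe { alloc_zeroed(layout) };
        let ptr = NonNull::new(raw).unwrap_or_else(|| handle_alloc_error(layout));
        Self { ptr, layout }
    }

    /// Start of the buffer; always a multiple of `MIN_ALIGN`.
    pub fn as_ptr(&self) -> *const u8 {
        self.ptr.as_ptr()
    }

    /// Size of the buffer in bytes.
    pub fn len(&self) -> usize {
        self.layout.size()
    }
}

impl Drop for AlignedColumn {
    fn drop(&mut self) {
        // SAFETY: `ptr` was allocated with exactly this `layout`.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) }
    }
}
```

For example, `AlignedColumn::new(100).as_ptr() as usize % MIN_ALIGN` evaluates to `0`. Fixing the alignment in the `Layout` rather than querying the platform sidesteps the missing "preferred vector width" query mentioned above, at the cost of possibly under-aligning for wider vector types.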

> Would disabling avx512f and enabling avx512vl, avx512dw and avx512bw instead work? Or do those have a hard dependency on avx512f?

Great question. Actually, the behaviour you're after is entirely...