snmalloc

Use the custom memcpy for realloc.

Open davidchisnall opened this issue 3 years ago • 3 comments

This wraps our memcpy with some assumptions that let the optimiser know that we're copying chunks that are strongly aligned. With clang 13 on x86, this generates three variants:

  • A special case for 16 bytes that's a single vector load + store.
  • A vector-copy loop for sizes <512 bytes.
  • rep movsb for larger sizes.

This is almost certainly faster than the platform memcpy (if for no other reason than that it doesn't have to care about handling unaligned copies).
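As a rough illustration of the idea (not the code in this change), wrapping a copy routine with alignment assumptions might look like the sketch below. The name `aligned_copy` and the 16-byte alignment are hypothetical, `std::memcpy` stands in for the project's own memcpy, and `__builtin_assume_aligned` is a GCC/Clang builtin. What code the optimiser actually emits depends on the compiler and target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper for illustration only; not snmalloc's actual API.
// Preconditions (assumed, not checked): dst and src are both 16-byte
// aligned and the two ranges do not overlap.
inline void* aligned_copy(void* dst, const void* src, std::size_t size)
{
  // Tell the optimiser about the alignment (GCC/Clang builtin).
  // Passing a pointer that is not 16-byte aligned is undefined behaviour.
  void* d = __builtin_assume_aligned(dst, 16);
  const void* s = __builtin_assume_aligned(src, 16);

  // A constant-size copy of one aligned 16-byte chunk gives the compiler
  // the chance to use a single vector load and store.
  if (size == 16)
  {
    return std::memcpy(d, s, 16);
  }

  // General case: std::memcpy stands in for the custom copy routine.
  return std::memcpy(d, s, size);
}
```

A caller would use it only on buffers that are known to satisfy the alignment precondition, for example storage declared with `alignas(16)`.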

I don't know if it will show up in benchmarks, but if it does then it Fixes #154

davidchisnall, Mar 16 '22 14:03

@nwf, the memcpy is currently incorrect for CHERI. It's probably worth tweaking the default one for any platform where the AAL says that it's CHERI and specialising it for Morello...

davidchisnall, Mar 16 '22 14:03

So running some of my usual benchmarks does not show any statistically significant difference. I am still happy to take this. I wonder if it makes sense to have a micro-benchmark to test a collection of reallocs to see if this is a win in an artificial scenario.
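A minimal sketch of what such a micro-benchmark could look like is below. The function name, result struct, and size parameters are all hypothetical; it times repeated doubling `realloc` calls through whichever allocator the process is linked against, and is not taken from the snmalloc test suite.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical result type for this sketch.
struct ReallocBenchResult
{
  double seconds;       // wall-clock time for the whole run
  std::size_t reallocs; // number of successful realloc calls
  bool data_preserved;  // false if a check failed or malloc returned null
};

// Grow a block from start_size to max_size by doubling, `rounds` times.
// Assumes start_size > 0.
inline ReallocBenchResult bench_realloc_growth(
  std::size_t rounds, std::size_t start_size, std::size_t max_size)
{
  ReallocBenchResult r{0.0, 0, true};
  auto begin = std::chrono::steady_clock::now();

  for (std::size_t i = 0; i < rounds; i++)
  {
    std::size_t size = start_size;
    auto* p = static_cast<unsigned char*>(std::malloc(size));
    if (p == nullptr)
    {
      r.data_preserved = false;
      break;
    }
    std::memset(p, 0xA5, size);

    while (size * 2 <= max_size)
    {
      std::size_t new_size = size * 2;
      auto* q = static_cast<unsigned char*>(std::realloc(p, new_size));
      if (q == nullptr)
      {
        break; // p is still valid and is freed below
      }
      p = q;
      // Touch the old contents so the copy cannot be elided.
      if (p[0] != 0xA5 || p[size - 1] != 0xA5)
      {
        r.data_preserved = false;
      }
      std::memset(p + size, 0xA5, new_size - size);
      size = new_size;
      r.reallocs++;
    }
    std::free(p);
  }

  auto end = std::chrono::steady_clock::now();
  r.seconds = std::chrono::duration<double>(end - begin).count();
  return r;
}
```

Running this before and after the change, with enough rounds to get stable timings, would give one artificial data point. It also counts reallocs that the allocator satisfies in place, so the numbers only partly reflect copy cost.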

mjp41, Mar 16 '22 19:03

If it isn't a bottleneck then it's probably not worth the code size increase. It's small but a lot of small things add up.

davidchisnall, Mar 17 '22 09:03

> If it isn't a bottleneck then it's probably not worth the code size increase. It's small but a lot of small things add up.

I'll close, we can always revisit.

mjp41, Mar 23 '23 14:03