CensoredUsername comments

Results: 214 comments of CensoredUsername

Language dialect reference for AArch64 has some stale x86/x86-64 references

That should fix the stale references. I've kept static_reg_name and dynamic_reg_name as those are actually a thing (particularly, the X/XSP distinction is important). I also documented label reference offsets which...

Allow reserving space in various Vec and HashMaps for performance

Hmm, if you guys are running into this being an issue, then you are probably creating and destroying significant amounts of VecAssemblers. I'm guessing one per function created? Reserving capacity...

Allow reserving space in various Vec and HashMaps for performance

@mkeeter if you're noticing too much mprotect overhead with the stock assemblers the best thing to look at is reducing the number of times you call commit (you don't _have_...

Allow reserving space in various Vec and HashMaps for performance

These changes are now live on dev, would anyone be able to provide some benchmarks to judge if there's still some performance left on the table? (particularly, how much of...

Allow reserving space in various Vec and HashMaps for performance

@mkeeter ah yeah that's a pain on aarch64. I'm still working on proper handling of that myself (see the next_major_release branch). It is quite the pain, especially since you have...

Allow reserving space in various Vec and HashMaps for performance

Interesting. Is this just with a drop-in replacement (different hashing algorithm) or is this including reuse of VecAssemblers?

Allow reserving space in various Vec and HashMaps for performance

Neat, that speedup was then just the improved hasher. The stock `Assembler` shouldn't have any perf issues due to allocations to begin with as it was always reusing them.
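
For anyone skimming the thread, the allocation-reuse point can be illustrated with a minimal sketch that uses only the standard library. `CodeBuffer`, `emit` and `finish` are hypothetical names made up for this illustration and are not part of the crate's API; the only thing the sketch relies on is that `Vec::clear` resets the length while keeping the existing capacity.

```rust
/// Hypothetical stand-in for a per-function code buffer.
/// This is NOT the crate's API, just an illustration of buffer reuse.
struct CodeBuffer {
    bytes: Vec<u8>,
}

impl CodeBuffer {
    /// Allocate the backing storage once, up front.
    fn with_capacity(capacity: usize) -> Self {
        CodeBuffer {
            bytes: Vec::with_capacity(capacity),
        }
    }

    /// Append some already-encoded bytes for the function being built.
    fn emit(&mut self, code: &[u8]) {
        self.bytes.extend_from_slice(code);
    }

    /// Let the caller consume the finished bytes, then reset the buffer.
    /// `clear` sets the length to zero but keeps the allocation, so the
    /// next function can be emitted without allocating again.
    fn finish<R>(&mut self, consume: impl FnOnce(&[u8]) -> R) -> R {
        let result = consume(&self.bytes);
        self.bytes.clear();
        result
    }
}
```

Creating one such buffer and calling `emit` / `finish` once per generated function keeps a single allocation alive, instead of allocating and freeing a fresh buffer for every function.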

Allow reserving space in various Vec and HashMaps for performance

>The only thing that makes me sad is that I’ll have to figure out a way to reuse the VecAssembler within rayon, which’ll be… fun… but that is my problem...

omit known zero offset

The crate is built with the idea in mind that instruction sizes (and the generated instruction) should be predictable from the code, to allow things like instruction patching / hotswapping....

Consider using `MAP_JIT` on macOS

Interesting, seems like Apple probably also ran into perf bottlenecks with memory protection swapping due to needing to JIT x64 code, so they added a thread-based switch that alters the...