hhvm icon indicating copy to clipboard operation
hhvm copied to clipboard

Use optimised forms of class release functions where possible

Open david-arm opened this issue 1 month ago • 6 comments

In makeDtorCall we call a release function of the form

void release(ObjectData *obj, Class *cl)

but in some cases we load the second class pointer argument directly from the object pointer. However, we can actually invoke a more optimised version of the release function:

void release2(ObjectData *obj)

and let this release function do the work of extracting the class pointer.

Doing so helps to reduce the total size of the Jit code cache by removing an associated load for every call. It may slightly increase the total instruction execution count by adding an extra level of indirection, but I think it's worth it.

david-arm avatar Nov 19 '25 10:11 david-arm

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D87429159. (Because this pull request was imported automatically, there will not be any future comments.)

meta-codesync[bot] avatar Nov 19 '25 10:11 meta-codesync[bot]

it's been over five years since i worked at this team or company, but surely nobody would object (ha ha ha) to some on-a-whim free labor, right?

it's not apparent to me that the additional call layer is the only potential cost; you may also be incurring an additional dependent load. loading the Class of an object is an lea (on x86 at least) which will likely be eliminated much of the time in larger function bodies if the Class is required elsewhere (which is common). pushing the load down into native code precludes that load elimination, and at no benefit to code size if the Class was already live in a register anyway.

ObjectData::release() requires that the cls argument be the Class of the obj argument. i suspect the reason your version of things was not done in the past is due to the cost of said dependent load. you can probably check this yourself by looking at the code output of some of the test case

Without this patch on AArch64 I see a load of the class pointer from the object (required for the second argument of the original release function) in the Jitted code. That means every instance we generate a call to this runtime function we are also generating an additional load. What this patch does is move that additional load to a single point (in the new release2 function), which reduces the overall size of the Jitted code.

david-arm avatar Nov 19 '25 14:11 david-arm

i was imagining that identical loads might be eliminated if the value is already known to be live in a register. but upon more careful recollection i don't think vasm has any load or memory tracking except for the frame pointer specifically. my memory is of course very obsolete, just tossing the idea out there.

anyway all that said, object destruction has to actually free up memory so i don't see why an extra load would matter; maybe the code size reduction is worth it regardless. someone on the inside would have to notice this PR and perf test it i guess. (part of the reason i commented, idk who's doing that these days if anyone.)

mxw avatar Nov 19 '25 14:11 mxw

I seem to recall that one benefit of the current scheme was that if we knew the Class statically, we could supply it as a constant, saving a load all together.

ricklavoie avatar Nov 20 '25 04:11 ricklavoie

i think that case is handled separately just above where this diff makes its change.

mxw avatar Nov 20 '25 05:11 mxw

Thanks @david-arm for the PR. This passes CI tests internally. We will A/B perf test to see if there any significant changes.

meteorfox avatar Nov 24 '25 17:11 meteorfox

@david-arm We got some perf data on this one. This one reliably regresses on branch-misses by 2% and see about 0.5-1% hit on perf.

meteorfox avatar Dec 10 '25 19:12 meteorfox

@david-arm We got some perf data on this one. This one reliably regresses on branch-misses by 2% and see about 0.5-1% hit on perf.

OK no worries, thanks for testing! It was worth a try. :)

david-arm avatar Dec 11 '25 09:12 david-arm