mmtk-core

Never use side metadata for LOS-local metadata

Open steveblackburn opened this issue 1 year ago • 4 comments

LOS-specific metadata should be either in the vm header (if the VM allows it), or in an MMTk-owned header word allocated before the vm object.
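As a rough illustration of the second option, the address arithmetic for an MMTk-owned header word placed directly before the VM object could look like the sketch below. All names here (`LOS_HEADER_BYTES`, `object_start`, `header_address`, `allocation_size`) are hypothetical and are not part of the mmtk-core API; the sketch also ignores any alignment requirement the VM may impose on the object start.

```rust
use std::mem::size_of;

/// Hypothetical size of the MMTk-owned header: one machine word.
pub const LOS_HEADER_BYTES: usize = size_of::<usize>();

/// Bytes to request from the space for a VM object of `object_bytes`,
/// including the extra header word in front of it.
pub fn allocation_size(object_bytes: usize) -> usize {
    object_bytes + LOS_HEADER_BYTES
}

/// Where the VM object begins, given the start of the raw allocation.
/// (Assumption: no extra padding is needed to align the object.)
pub fn object_start(allocation_start: usize) -> usize {
    allocation_start + LOS_HEADER_BYTES
}

/// Where the MMTk-owned header word lives, given the VM object's start.
pub fn header_address(object_start: usize) -> usize {
    object_start - LOS_HEADER_BYTES
}
```

Under these assumptions the LOS metadata is always exactly one word below the object, so finding it needs no table lookup.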

This is because side metadata introduces a spatial tax. That tax is entirely avoidable in the case of the LOS, and since the LOS constitutes a lot of the used heap space in many applications, the impact of this can be significant.

The tradeoff is quite different for LOS objects (where the addition of an extra word in front of an object is a small overhead), compared to small objects (where an extra word is an expensive overhead to be avoided).

For global metadata (applied uniformly to all objects), a header word is not an option. But for LOS-specific metadata, using side metadata is an error.

Using side metadata for LOS objects is thus a performance bug, and the fix is fairly straightforward.

steveblackburn • Jun 07 '23 06:06

In the current code, LOS uses 2-bit local metadata, and the binding can choose to use header bits or side metadata.

It could be put in the object header, which takes no extra space.

It could also be put in side metadata, which costs 2 bits per page (2 bits per 4KB of memory). If we use a header word per object (64 bits per object, assuming a 64-bit word), it will be more space efficient only when the average large object size is larger than 128KB, and less space efficient when the average is smaller. Currently the Immix plans define the large object threshold as 16KB, and the other plans (copying plans and mark sweep) define it as 64KB. So in theory, a header word may not be more space efficient than side metadata. It depends on the language and the workload.
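The 128KB break-even point follows from equating the per-byte cost of the two schemes. A minimal sketch of that arithmetic (the function name and parameters are made up for illustration and are not from mmtk-core):

```rust
/// Object size (in bytes) at which one header word per object costs the same
/// as side metadata charged per page.
///
/// Derivation: header_bits / size == side_bits_per_page / page_bytes
///          => size == header_bits * page_bytes / side_bits_per_page
pub fn break_even_object_bytes(
    header_bits: u64,
    side_bits_per_page: u64,
    page_bytes: u64,
) -> u64 {
    header_bits * page_bytes / side_bits_per_page
}
```

With a 64-bit header word, 2 bits of side metadata per page, and 4KB pages, `break_even_object_bytes(64, 2, 4096)` gives 131072 bytes, i.e. 128KB. Objects larger than that are cheaper with a header word; smaller ones are cheaper with side metadata.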

And the actual difference between the two options is not significant. With 32GB of LOS, the side metadata takes only 2MB, so we can save at most 2MB by using a header word for LOS. With a header word, the extra memory would range from 8 bytes (1 x 32GB object) to 16MB (worst case, 2M x 16KB objects).

| | Current: VM header | Current: side metadata | Proposed: extra header |
|---|---|---|---|
| Metadata cost for 32GB LOS | 0 | 2MB | 8 bytes to 16MB |
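The figures for a 32GB LOS can be reproduced with a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, and it assumes 4KB pages, 2 bits of side metadata per page, and an 8-byte header word, as in the comment above.

```rust
/// Page size assumed in the discussion above.
pub const PAGE_BYTES: u64 = 4096;

/// Bytes of side metadata needed to cover `los_bytes` of large object space.
pub fn side_metadata_bytes(los_bytes: u64, bits_per_page: u64) -> u64 {
    (los_bytes / PAGE_BYTES) * bits_per_page / 8
}

/// Bytes spent on per-object header words if the space is filled with
/// objects of `object_bytes` each.
pub fn header_bytes(los_bytes: u64, object_bytes: u64, header_word_bytes: u64) -> u64 {
    (los_bytes / object_bytes) * header_word_bytes
}
```

For `los_bytes = 32 << 30`, `side_metadata_bytes(los_bytes, 2)` is 2MB, `header_bytes(los_bytes, los_bytes, 8)` is 8 bytes (a single 32GB object), and `header_bytes(los_bytes, 16 << 10, 8)` is 16MB (2M objects of 16KB each).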

I don't think using an extra header word is clearly better in terms of space efficiency. There may be other reasons to switch to a header word for LOS, but they are not listed in the issue itself.

qinsoon • Jan 01 '24 23:01

I think your analysis is right, @qinsoon. It is not "clearly better" to put the metadata in the header.

I'm going to close this issue.

steveblackburn • Jan 23 '24 08:01

I'm not too sure whether we should just close this. I think we need to consider both the space overhead and spatial locality.

In the original comment, I'm not quite sure which of the two, or both, Steve was referring to as "spatial tax."

caizixian • Jan 23 '24 10:01

> I'm not too sure whether we should just close this. I think we need to consider both the space overhead and spatial locality.
>
> In the original comment, I'm not quite sure which of the two, or both, Steve was referring to as "spatial tax."

The issue was reopened for this comment, but changed to low priority.

The main concern is the performance overhead arising from poor spatial locality. We allow metadata to be in the header or on the side. If the binding uses side metadata, there is a penalty for worse locality, and this applies to all spaces. LOS is a bit different, however, as we can afford to always use a header (we cannot do so for other spaces). So the question now is whether we should always use a header for LOS for possibly better performance. That is not yet known, so the issue was changed to low priority.
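To make the locality concern concrete, here is a simplified sketch of where the two kinds of metadata sit relative to the object. It is not mmtk-core's actual address mapping: the function names, the contiguous side table starting at `side_base`, and the one-entry-per-page layout are all assumptions made for illustration.

```rust
/// Byte address of the side metadata entry covering `object_addr`, assuming
/// a contiguous table at `side_base` with `bits_per_page` bits for each page
/// of `1 << log_page_bytes` bytes. (Hypothetical layout.)
pub fn side_metadata_byte_address(
    side_base: u64,
    object_addr: u64,
    log_page_bytes: u32,
    bits_per_page: u64,
) -> u64 {
    let page_index = object_addr >> log_page_bytes;
    side_base + (page_index * bits_per_page) / 8
}

/// Byte address of a header word placed directly before the object.
pub fn header_word_address(object_addr: u64, header_word_bytes: u64) -> u64 {
    object_addr - header_word_bytes
}
```

In this model the header word is always within a few bytes of the object, and so likely on the same or an adjacent cache line, while the side metadata entry lives in a separate table that may be far away in the address space. Whether that difference is measurable for LOS is exactly the open question.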

qinsoon • Jan 24 '24 02:01