[GR-51851] Add initial support for native memory tracking (NMT mallocs)

Open roberttoyonaga opened this issue 2 years ago • 7 comments

Summary

As a first step toward supporting native memory tracking (NMT), this PR adds support for tracking malloc/calloc/realloc. Virtual memory tracking support will be added in a separate PR, as it is mostly independent of this code.

Related issue: https://github.com/oracle/graal/issues/7631

This already includes the cleanup from: https://github.com/oracle/graal/pull/7669

Implementation

This PR adopts an approach similar to NMT in HotSpot: it uses "malloc headers". A malloc header caches metadata about a native allocation. To make room for it, a small amount of additional space is requested contiguous to the user allocation. Malloc headers are always the same size. The metadata is later used to update the memory tracking when the block is freed. Using malloc headers requires certain precautions; please see the javadoc on class NativeMemoryTracking for more info.
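
To make the idea concrete, here is a minimal toy model of a malloc header. It is not the code from this PR: the class name, the header layout (8-byte size, 4-byte category, 4 bytes of padding) and the bump allocator are all assumptions for illustration, and a `ByteBuffer` stands in for native memory.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of the malloc-header idea; NOT the actual NativeMemoryTracking code. */
final class MallocHeaderSketch {
    // Hypothetical layout: 8-byte size + 4-byte category + 4 bytes padding.
    static final int HEADER_SIZE = 16;

    // Total payload bytes currently tracked.
    static final AtomicLong trackedBytes = new AtomicLong();

    // Simulated "native" memory with a trivial bump allocator (single-threaded toy).
    private static final ByteBuffer heap = ByteBuffer.allocate(4096);
    private static int top = 0;

    /** Reserves header + payload and returns the "pointer" to the payload. */
    static int malloc(int size, int category) {
        int outer = top; // start of the header
        if (outer + HEADER_SIZE + size > heap.capacity()) {
            throw new OutOfMemoryError("toy heap exhausted");
        }
        top += HEADER_SIZE + size;
        heap.putLong(outer, size);        // cache the size in the header
        heap.putInt(outer + 8, category); // cache the category in the header
        trackedBytes.addAndGet(size);
        return outer + HEADER_SIZE;       // the caller only sees the payload
    }

    /** Reads the header in front of the payload to undo the accounting. */
    static void free(int payloadPointer) {
        int outer = payloadPointer - HEADER_SIZE;
        long size = heap.getLong(outer);
        trackedBytes.addAndGet(-size);
        // A real allocator would now release the block starting at 'outer'.
    }
}
```

Because the header sits at a fixed offset in front of the payload, `free` needs no lookup and no lock to find the metadata. The flip side is that `free` must only ever be called on pointers that really have a header in front of them.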

Alternative Approaches

One alternative to malloc headers is a big look-up table that caches the metadata. This approach has the advantage of being more robust: it does not have the disadvantages of malloc headers described in the NativeMemoryTracking javadoc. However, it has a few big disadvantages of its own:

  1. Work would need to be done to tune the table size.
  2. Access to the table would need to be synchronized. The benefit of malloc headers is that no locking is required after initialization.
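
For comparison, the table-based alternative could look roughly like the sketch below. The class and method names are hypothetical, and a plain `HashMap` is used only to keep the example short; the point is that every update has to go through a lock.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the look-up table alternative; names are hypothetical. */
final class LookupTableSketch {
    // Maps the address of each live allocation to its size.
    private final Map<Long, Long> sizeByAddress = new HashMap<>();
    private long trackedBytes;

    /** Every malloc must take the lock to record its metadata. */
    synchronized void recordMalloc(long address, long size) {
        sizeByAddress.put(address, size);
        trackedBytes += size;
    }

    /** Every free must take the lock to find and remove its metadata. */
    synchronized void recordFree(long address) {
        Long size = sizeByAddress.remove(address);
        if (size != null) {
            trackedBytes -= size;
        }
        // Unknown addresses are simply ignored, which is what makes this
        // approach more robust than reading a header that may not exist.
    }

    synchronized long trackedBytes() {
        return trackedBytes;
    }
}
```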

An alternative to enabling NMT at runtime is to decide whether to enable NMT at build time. This would allow us to bypass a lot of complexity by eliminating the special handling done to accommodate a pre-init phase. This is a possibility not available in HotSpot. In this PR, I've decided to implement the handling required to postpone this decision until runtime, not because I firmly believe it's the best choice, but because it might help facilitate discussion on whether it's worth the added flexibility. The code that handles pre-init is relatively independent and can be easily removed. The advantages of enabling at runtime are:

  1. Parity with HotSpot.
  2. More flexibility for the user.

The disadvantages are:

  1. Additional code complexity
  2. Allocations in the pre-init phase that live across the initialization boundary cannot be freed. This prevents their addresses from being reused. This may not be a big deal because the number of pre-init allocations that survive beyond initialization and would normally be freed shortly afterward is likely small.
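
The second disadvantage can be illustrated with a small sketch. It assumes, as one plausible reading of the description above, that blocks allocated before the runtime decision carry no malloc header and are remembered by address instead. All names are hypothetical and this is not the pre-init code from the PR.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of pre-init bookkeeping; an assumption-laden sketch, not the PR's code. */
final class PreInitSketch {
    // Addresses handed out before NMT was initialized (these blocks have no header).
    private static final Set<Long> preInitAddresses = new HashSet<>();
    private static boolean initialized = false;

    /** Called for every block the (simulated) allocator hands out. */
    static void onMalloc(long address) {
        if (!initialized) {
            preInitAddresses.add(address);
        }
    }

    /** Marks the end of the pre-init phase. */
    static void finishInit() {
        initialized = true;
    }

    /** Returns true if the block may really be released to the underlying allocator. */
    static boolean onFree(long address) {
        if (!initialized) {
            // Still in pre-init: forget the address and release the block normally.
            preInitAddresses.remove(address);
            return true;
        }
        // After init: a surviving pre-init block is deliberately leaked. If it were
        // released, its address could be handed out again to a block that does have
        // a header, and the two could no longer be told apart by address.
        return !preInitAddresses.contains(address);
    }
}
```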

Limitations

  • NMT has not been hooked up to JFR; support for NMT JFR events has not been added yet. This shouldn't be too hard to do, but I've left it out to avoid this PR getting too big. For now, a single snapshot of the data is logged at shutdown. This isn't very useful because you only get one data point.
  • Support is only for Linux. It should be possible to add Windows support later; similar tweaks will be needed in the Windows platform-specific code.
  • Only a few native allocation call sites have been labelled (mtTest, mtNMT, mtTracing, mtThread, mtOther). Everything else is aggregated into the default mtNone for now. Manual labeling of other call sites can be done in future PRs.
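
As a rough illustration of per-category accounting with a single snapshot at shutdown, consider the sketch below. The category labels are the ones listed above; the enum, the class and its methods are hypothetical and do not mirror the PR's API.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Toy per-category byte counters; names other than the mt* labels are hypothetical. */
final class CategoryCountersSketch {
    enum Category { mtNone, mtTest, mtNMT, mtTracing, mtThread, mtOther }

    // One lock-free counter per category, indexed by ordinal.
    private static final AtomicLongArray usedBytes =
            new AtomicLongArray(Category.values().length);

    static void track(Category category, long size) {
        usedBytes.addAndGet(category.ordinal(), size);
    }

    static void untrack(Category category, long size) {
        usedBytes.addAndGet(category.ordinal(), -size);
    }

    static long used(Category category) {
        return usedBytes.get(category.ordinal());
    }

    /** Builds the kind of one-off report that could be logged at shutdown. */
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        for (Category c : Category.values()) {
            sb.append(c).append('=').append(used(c)).append('\n');
        }
        return sb.toString();
    }
}
```

Call sites that have not been labelled yet would pass `Category.mtNone`, so their bytes are still counted, just not attributed.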

roberttoyonaga avatar Nov 23 '23 20:11 roberttoyonaga

Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA). The following contributors of this PR have not signed the OCA:

To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application.

When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated.

If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public.

@fniephaus do you know why the OCA check is failing for me now?

roberttoyonaga avatar Nov 23 '23 21:11 roberttoyonaga

@roberttoyonaga Sorry, there was an issue on our side. It's now resolved and the check is passing. Thank you for your contribution!

alina-yur avatar Nov 24 '23 10:11 alina-yur

@roberttoyonaga : thanks, I started integrating this PR.

christianhaeubl avatar Feb 06 '24 12:02 christianhaeubl

> @roberttoyonaga : thanks, I started integrating this PR.

Thank you @christianhaeubl. What do you think about allowing NMT to be enabled at runtime vs. at build time?

Also, I think it might be a good idea to pad the malloc header with 4B so it's a total of 16B for alignment reasons. I haven't done this yet, but it should be easy and not have any side effects other than increased memory usage.

roberttoyonaga avatar Feb 06 '24 13:02 roberttoyonaga

I think that we should only have a hosted option for NMT (i.e., if the option is enabled at image build-time, NMT is used at run-time).

As far as I can see, the malloc header is already 16 bytes (it is rounded up to wordSize).
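
The arithmetic behind this can be checked with a short sketch. The raw header size of 12 bytes is inferred from the "pad with 4B to get 16B" remark above, the 8-byte word size assumes a 64-bit platform, and the helper name is hypothetical.

```java
/** Worked example of rounding a header size up to the word size. */
final class HeaderAlignmentSketch {
    /** Rounds value up to the next multiple of alignment (which must be a power of two). */
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & ~(alignment - 1);
    }

    public static void main(String[] args) {
        long rawHeaderSize = 12; // assumption: inferred from the 4B padding remark
        long wordSize = 8;       // assumption: 64-bit platform
        // (12 + 7) & ~7 = 19 & ~7 = 16
        System.out.println(alignUp(rawHeaderSize, wordSize)); // prints 16
    }
}
```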

christianhaeubl avatar Feb 06 '24 14:02 christianhaeubl

> I think that we should only have a hosted option for NMT (i.e., if the option is enabled at image build-time, NMT is used at run-time).

Ok, in that case, we can remove much of the code in NativeMemoryTracking.java

> As far as I can see, the malloc header is already 16 bytes (it is rounded up to wordSize).

Right, I forgot about that

roberttoyonaga avatar Feb 06 '24 14:02 roberttoyonaga