Limit the amount of memory used by JITServer AOT cache
Currently there is no upper bound on the amount of memory used by the JITServer AOT cache. Add a configurable limit and stop adding new entries once the limit has been reached.
From what I can tell, the non-JITServer AOT cache keeps track of the used bytes and a maxAOT value so that getFreeAOTBytes can use them. These are used in SH_CompositeCacheImpl::allocate to determine whether or not a new block of memory can be allocated. If it can't, a header flag to that effect is set and a NULL block is returned. This all happens after allocateAOT is called in SH_CacheMap::addROMClassResourceToCache.
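Roughly, the pattern looks like the sketch below; the struct, field, and function names are hypothetical stand-ins for the shared cache internals, not the actual SH_CompositeCacheImpl code.

    #include <cstddef>
    #include <cstdint>
    #include <cstdlib>

    struct SharedCacheState
       {
       size_t usedAOTBytes; // bytes already consumed by AOT data
       size_t maxAOTBytes;  // configured maximum (the maxAOT value)
       bool   aotFullFlag;  // "cache full" header flag
       };

    // Hypothetical allocate path: refuse the block and record the full state
    // once the new allocation would exceed the AOT limit.
    static uint8_t *allocateAOTBlock(SharedCacheState *cache, size_t size)
       {
       size_t freeAOTBytes = cache->maxAOTBytes - cache->usedAOTBytes;
       if (size > freeAOTBytes)
          {
          cache->aotFullFlag = true;
          return NULL;
          }
       cache->usedAOTBytes += size;
       // Stand-in for handing out a block from the shared cache.
       return static_cast<uint8_t *>(std::malloc(size));
       }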
I think that something similar can be done in the JITServer AOT cache, but I still have to look at the code. It looks like the relevant parts are mostly in JITServerAOTCache.
The JITServer AOT cache is actually a map from AOT cache names to individual AOT cache instances (to segregate the records by client, I think). Should the limit apply to the caches individually, or to the sum of their sizes?
If it's the sum of their sizes, then we can have the main JITServerAOTCacheMap cache the total size of the data that it manages. We can then add a check in AOTCacheRecord::allocate that reads that variable and compares it against the hard AOT cache limit, so the allocation fails if the limit would be exceeded and the running total is updated otherwise. That would mean that AOTCacheRecord would need to know about the JITServerAOTCacheMap it's allocating for.
Since nothing is ever deleted from the JITServer AOT caches, we could even have a static _totalRecordBytes in the AOTCacheRecord class that keeps track of this total, perhaps protected by its own TR::Monitor.
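A minimal sketch of the static-counter idea, assuming hypothetical names for the counter, limit, and lock (a std::mutex stands in for the TR::Monitor, and malloc for the persistent allocator):

    #include <cstddef>
    #include <cstdlib>
    #include <mutex>

    class AOTCacheRecord
       {
       public:
       // Hypothetical limit check: fail the allocation (return NULL) once the
       // running total would exceed the configured cap.
       static void *allocate(size_t size)
          {
          std::lock_guard<std::mutex> lock(_lock);
          if (_totalRecordBytes + size > _recordBytesLimit)
             return NULL;
          void *block = std::malloc(size); // real code: persistent allocator
          if (block)
             _totalRecordBytes += size;
          return block;
          }

       private:
       static std::mutex _lock;         // stand-in for a TR::Monitor
       static size_t _totalRecordBytes; // only grows; records are never freed
       static size_t _recordBytesLimit; // set from a new command line option
       };

    std::mutex AOTCacheRecord::_lock;
    size_t AOTCacheRecord::_totalRecordBytes = 0;
    size_t AOTCacheRecord::_recordBytesLimit = 300 * 1024 * 1024; // illustrative default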
You can already get the number of bytes allocated for all the AOT caches at the JITServer as TR::Compiler->persistentGlobalMemory()->_totalPersistentAllocations[TR_Memory::JITServerAOTCache]. Please note that AOTCacheRecord::allocate() only allocates individual records, but not the JITServerAOTCache instances, PersistentMaps in them, etc. There is also some data stored in ClientSessionData such as _aotCacheKnownIds that is not included in _totalPersistentAllocations[TR_Memory::JITServerAOTCache].
There are two options to consider for limiting the memory usage: a limit on the total memory used by all AOT cache instances, or a limit per AOT cache instance. In the latter case, we would need to keep track of individual allocations via AOTCacheRecord::allocate(), as well as some estimate of the additional data such as the PersistentMaps.
Also note that this does not include any overhead (e.g. block headers) introduced by the persistent allocator implementation, so this number is a low estimate. For the things allocated with AOTCacheRecord::allocate(), we can keep track of a closer estimate by adding the size of persistent allocator block header to each allocation. I think it's safe to assume that almost all AOT cache allocations are "small" (see the persistent allocator implementation). I think a better solution is to account for the block header overhead in the persistent allocator implementation, but that will probably require some changes to its interface in both OMR and OpenJ9.
I'm not sure if it's worth synchronizing a separate total size counter with a monitor given that it will probably only be an estimate. If anything, atomic increments should have less overhead than a monitor.
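For instance, a relaxed atomic counter is enough for an estimate, and an assumed per-block header size can be folded into each increment (the overhead constant here is purely illustrative, not the allocator's actual header size):

    #include <atomic>
    #include <cstddef>

    // Running estimate of AOT cache memory usage at the server.
    static std::atomic<size_t> totalAOTCacheBytes(0);

    // Illustrative constant; the real value depends on the persistent
    // allocator's block header layout.
    static const size_t ALLOC_HEADER_OVERHEAD = 16;

    // Atomically add an allocation to the estimate and return the new total,
    // which the caller can compare against the configured limit.
    static size_t addAOTCacheAllocation(size_t size)
       {
       size_t charged = size + ALLOC_HEADER_OVERHEAD;
       return totalAOTCacheBytes.fetch_add(charged, std::memory_order_relaxed) + charged;
       }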
Should the limit apply to the caches individually, or to the sum of their sizes?
I think we need to look at the sum of all sizes. The potential user that brought this to our attention was concerned about the JITServer process consuming an unbounded amount of memory. If we don't set the cap globally, it's easy to grow memory consumption indefinitely by using a different cache name for each client.
Since we need to account for all of the entries in the maps from JITServerAOTCache, I think that TR::Compiler->persistentGlobalMemory()->_totalPersistentAllocations[TR_Memory::JITServerAOTCache] is probably a better approximation of the total memory consumed by the AOT cache. Yes, it doesn't include the block header overhead, but we can add that later. I believe the intent is to put some limit on otherwise unbounded memory growth, so as long as our approximation is proportional to the real memory usage, it's still valuable.
I agree that it should be the total memory usage by all caches. That also makes the implementation much easier. Maybe we should also have a limit for the number of caches (with a large default value). For the block header overhead, someone should open an issue so that it doesn't get lost, and the documentation for this feature should state that the memory limit is an approximation that is always lower than the actual memory usage.
Also note that AOT cache records get allocated during AOT compilations at the server, and allocation failures are handled with exceptions rather than by propagating NULL return values. Simply failing these allocations once the limit is reached would probably result in AOT compilation failures. Instead, we basically need to make the AOT cache read-only. There is a boolean flag aotCacheStore, set somewhere in JITServerCompilationThread, which will need to always be forced to false once the limit is reached.
There will still be some memory overhead in each client session (the cache of AOT cache IDs known by the client) even if the cache is read-only, but that memory is freed once the client disconnects, so it can't grow indefinitely, and I think we can ignore it.
That makes sense as a first pass at the problem. Is setting _aotCacheStore to false the only thing that needs to be done? If so, then we'd only have to add a check of the total allocations at the line
_aotCacheStore = classChain && aotCache;
where it's currently set. That, and add a new command line option, of course.
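Something along those lines, where aotCacheFull() is a hypothetical helper (perhaps on JITServerAOTCacheMap) that compares the tracked total against the value of the new option:

    // Hypothetical: also require that the global AOT cache memory limit has
    // not been reached before allowing stores for this compilation.
    _aotCacheStore = classChain && aotCache && !aotCacheFull();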
Is setting _aotCacheStore to false the only thing that needs to be done?
After looking at the code, that's not actually enough, since AOT cache records can still be created during loads: they are part of the key used to look up cached methods, namely the defining class chain record (which also refers to class records and class loader records) and the AOT header record. We can add a boolean argument to the JITServerAOTCache::get*Record() methods to control whether they can create new records or only return existing ones (when in read-only mode). We might also need to add some NULL checks on the results of these functions.
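A self-contained sketch of that get-or-create change, with hypothetical type and member names; when canCreate is false (read-only mode) a missing record is not created, and the caller has to handle the NULL result:

    #include <cstddef>
    #include <map>
    #include <string>

    struct ClassRecord
       {
       // record payload omitted
       };

    class AOTCacheSketch
       {
       public:
       // Hypothetical lookup: returns an existing record, and only allocates a
       // new one when canCreate is true.
       ClassRecord *getClassRecord(const std::string &key, bool canCreate)
          {
          std::map<std::string, ClassRecord *>::iterator it = _classRecords.find(key);
          if (it != _classRecords.end())
             return it->second;
          if (!canCreate)
             return NULL; // read-only mode: callers must check for NULL
          ClassRecord *record = new ClassRecord();
          _classRecords[key] = record;
          return record;
          }

       private:
       std::map<std::string, ClassRecord *> _classRecords;
       };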