
iOS Crashing: Max heap size was exceeded

Open jfries289 opened this issue 1 year ago • 3 comments

Bug Description

Since switching to Hermes, one of our customers is reporting continual crashing 45-60 seconds after opening the app.

Sentry is reporting the following:

EXC_BAD_ACCESS RNSScreen

EXC_BAD_ACCESS
, message =  > [] reason = Max heap size was exceeded (1 from category: vm_allocate_category), numCollections = 1546, heapSize = 910163968, allocated = 888660456, va = 910163968, external = 2560326059 > lue =  > natural > OOM: error_code(value = 1, category = vm_allocate_category, message = Max heap size was exceeded) > y = vm_allocate_category, message = Max heap size was exceeded) > ZTUM >
Attempted to dereference null pointer.

Trace looks like:

hermes +0x03aba8   facebook::jsi::JSError::~JSError
hermes +0x03a368   facebook::jsi::JSError::~JSError
hermes +0x021718   facebook::jsi::JSError::~JSError
hermes +0x02104c   facebook::jsi::JSError::~JSError
hermes +0x00b948   facebook::hermes::HermesRuntime::rootsListLength
inspectflowplus +0x22a6c0   facebook::jsi::RuntimeDecorator<T>::call (decorator.h:297) In App
inspectflowplus +0x22a6c0   facebook::jsi::WithRuntimeDecorator<T>::call (decorator.h:700) In App
inspectflowplus +0x25bd7c   facebook::jsi::Function::call (jsi-inl.h:228) In App
inspectflowplus +0x25bd7c   facebook::jsi::Function::call (jsi-inl.h:233) In App
inspectflowplus +0x25bd7c   facebook::jsi::Function::call<T> (jsi-inl.h:241) In App
inspectflowplus +0x25bbe0   facebook::react::JSIExecutor::callFunction::lambda::operator() (JSIExecutor.cpp:256)
inspectflowplus +0x25bbe0   std::__1::__invoke<T> (type_traits:3918)
inspectflowplus +0x25bbe0   std::__1::__invoke_void_return_wrapper<T>::__call<T> (invoke.h:61)
inspectflowplus +0x25bbe0   std::__1::__function::__alloc_func<T>::operator() (function.h:178)
inspectflowplus +0x25bbe0   std::__1::__function::__func<T>::operator() (function.h:352)
inspectflowplus +0x15e83c   std::__1::__invoke<T> (type_traits:3918)
inspectflowplus +0x15e83c   std::__1::__invoke_void_return_wrapper<T>::__call<T> (invoke.h:61)
inspectflowplus +0x259150   std::__1::__function::__value_func<T>::operator() (function.h:505)
inspectflowplus +0x259150   std::__1::function<T>::operator() (function.h:1182)
inspectflowplus +0x259150   facebook::react::JSIExecutor::callFunction (JSIExecutor.cpp:254)
inspectflowplus +0x203128   std::__1::__function::__value_func<T>::operator() (function.h:505)
inspectflowplus +0x203128   std::__1::function<T>::operator() (function.h:1182)
inspectflowplus +0x203128   facebook::react::NativeToJsBridge::runOnExecutorQueue::lambda::operator() (NativeToJsBridge.cpp:310)
inspectflowplus +0x203128   std::__1::__invoke<T> (type_traits:3918)
inspectflowplus +0x203128   std::__1::__invoke_void_return_wrapper<T>::__call<T> (invoke.h:61)
inspectflowplus +0x203128   std::__1::__function::__alloc_func<T>::operator() (function.h:178)
inspectflowplus +0x203128   std::__1::__function::__func<T>::operator() (function.h:352)
inspectflowplus +0x161748   std::__1::__function::__value_func<T>::operator() (function.h:505)
inspectflowplus +0x161748   std::__1::function<T>::operator() (function.h:1182)
inspectflowplus +0x161748   facebook::react::tryAndReturnError (RCTCxxUtils.mm:74)
inspectflowplus +0x16c88c   facebook::react::RCTMessageThread::tryFunc (RCTMessageThread.mm:69)
inspectflowplus +0x16c640   std::__1::__function::__value_func<T>::operator() (function.h:505)
inspectflowplus +0x16c640   std::__1::function<T>::operator() (function.h:1182)
inspectflowplus +0x16c640   facebook::react::RCTMessageThread::runAsync (RCTMessageThread.mm:45)

There are 19 similar Sentry reports from the last few days from that user, each with a different heapSize. Could definitely use some help.

  • [ ] I have run gradle clean and confirmed this bug does not occur with JSC

Hermes version: 0.11.0
React Native version (if any): 0.68.1
OS version (if any): iOS 17 (though it was happening with 16.6.1 as well)
Device: iPad 13.10

Steps To Reproduce

  1. Haven't been able to reproduce locally

The Expected Behavior

Should not throw Max Heap size error when heap size is much less than 3 GB.

jfries289, Sep 22 '23 00:09

According to the error message, the total allocated memory seems to be 910163968 + 2560326059 = 3470490027 bytes (about 3.2 GiB), which is more than 3 GB. So that could explain why there is an error. If this happens so quickly, it looks like a very serious memory leak.
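To make the arithmetic explicit, here is a small TypeScript sketch that adds the two figures from the Sentry message. The byte counts are copied from the report; reading the "3 GB" limit as 3 GiB, and treating heapSize plus external as the quantity compared against it, are assumptions.

```typescript
// Figures copied from the Sentry message in the report (bytes).
const heapSize = 910163968; // "heapSize"
const external = 2560326059; // "external"

const GIB = 1024 * 1024 * 1024;

// Assumption: the "3 GB" limit mentioned in the report means 3 GiB.
const assumedLimit = 3 * GIB;

// Assumption: managed heap plus external memory is what counts against the limit.
const total = heapSize + external;
const totalGiB = total / GIB;
const exceedsLimit = total > assumedLimit;
```

Under these assumptions the sum lands slightly above the limit even though heapSize alone is under 1 GiB, so almost all of the pressure comes from the external figure.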

I will consult with others on the team to see what they think.

tmikov, Sep 22 '23 02:09

(BTW, I noticed that this is RN 0.68, which is a version of Hermes that is over a year old, making it that much harder to figure out what is going on. We are working on adding an ABI, which will make it easy to upgrade Hermes to the latest version, but until then, unfortunately, our ability to debug older versions of Hermes is limited.)

tmikov, Sep 22 '23 02:09

Looking at the reported error, the external memory consumption in this heap is relatively high, likely due to a lot of string or ArrayBuffer allocations.

While it is possible that this is occurring due to a leak somewhere in the application, there is also a known bug in the GC heuristics of the version of Hermes you are using, which shows up when external memory consumption is high (as is the case here). That bug was fixed in 82d358c760c1192e8cce78f57791d2f4e343da44, and the fix is included in RN 0.69 and later.

Is it possible for you to try a newer version of RN and see if the issue persists? If not, it is possible that a well-placed call to gc() in your code can mitigate the issue by forcing a collection. The bug only affects when the GC chooses to run, not what it collects, so forcing a collection should bring the memory consumption back in line.
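As a rough sketch of the gc() mitigation, the helper below forces a collection only when the engine exposes a global gc function. The names forceGcIfAvailable and GcHost are hypothetical, and where to call it (for example, after releasing large strings or ArrayBuffers) depends on the app.

```typescript
// Hypothetical shape of a global object that may expose gc().
type GcHost = { gc?: () => void };

// Forces a collection if the host exposes gc(); otherwise does nothing.
// Returns true when a collection was requested.
function forceGcIfAvailable(
  host: GcHost = globalThis as unknown as GcHost
): boolean {
  if (typeof host.gc === "function") {
    host.gc();
    return true;
  }
  return false;
}
```

Called with no argument, forceGcIfAvailable() checks globalThis and is a no-op on engines without a global gc, so it can stay in the code after upgrading to a version that has the fix.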

EDIT: There is also some information that would help us understand this issue better.

  1. Reporting the result of regular calls to HermesRuntime.getInstrumentedStats() would show how the heap is growing.
  2. Even better than (1) would be if you can hook into the AnalyticsCallback option of the Hermes RuntimeConfig (this may require modifying some RN source). That would give us information about GC activity and confirm if this is the issue I described above or something else.
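For (1), a minimal sketch of collecting and comparing stats from JavaScript might look like the following. It assumes the stats function is reachable as HermesInternal.getInstrumentedStats on the global object, which may not hold for every build; findStatsSource and numericDelta are hypothetical helper names, and no particular stat keys are assumed.

```typescript
type Stats = Record<string, unknown>;
type StatsSource = () => Stats;

// Assumption: Hermes exposes the stats to JS as HermesInternal.getInstrumentedStats.
// Returns undefined when that function is not present on the host.
function findStatsSource(host: unknown = globalThis): StatsSource | undefined {
  const internal = (
    host as { HermesInternal?: { getInstrumentedStats?: StatsSource } }
  ).HermesInternal;
  const fn = internal?.getInstrumentedStats;
  return typeof fn === "function" ? () => fn.call(internal) : undefined;
}

// Computes next - prev for every key that is numeric in both samples,
// so growth between two reports is visible without knowing key names.
function numericDelta(prev: Stats, next: Stats): Record<string, number> {
  const delta: Record<string, number> = {};
  for (const key of Object.keys(next)) {
    const a = prev[key];
    const b = next[key];
    if (typeof a === "number" && typeof b === "number") {
      delta[key] = b - a;
    }
  }
  return delta;
}
```

An app could call the source on a timer, attach each sample (or the delta from the previous one) to its crash reporter, and share the sequence leading up to a crash.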

neildhar, Sep 22 '23 03:09