
[v0.49.1] Crash in `RecordSet.merge(record:)`

Open rafalkitta opened this issue 3 years ago • 3 comments

Bug report

We have encountered hundreds of the following crashes:

Crashed: com.apollographql.ApolloStore
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x0000000bbd4f7440

Not related to any particular iOS version or device.

Stack trace from Crashlytics:

Crashed: com.apollographql.ApolloStore
0  libobjc.A.dylib                0x39d0 objc_retain + 16
1  libswiftCore.dylib             0x39edc4 swift::metadataimpl::ValueWitnesses<swift::metadataimpl::ObjCRetainableBox>::initializeWithCopy(swift::OpaqueValue*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*) + 28
2  libswiftCore.dylib             0x3b3cfc swift::metadataimpl::ValueWitnesses<swift::metadataimpl::OpaqueExistentialBox<0u> >::initializeWithCopy(swift::OpaqueValue*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*) + 128
3  Apollo                         0x1042c outlined init with copy of Any + 44 (<compiler-generated>:44)
4  Apollo                         0x3ef18 RecordSet.merge(record:) + 504 (<compiler-generated>:504)
5  Apollo                         0x3f5fc RecordSet.merge(records:) + 51 (RecordSet.swift:51)
6  Apollo                         0x32dd8 protocol witness for NormalizedCache.merge(records:) in conformance InMemoryNormalizedCache + 21 (InMemoryNormalizedCache.swift:21)
7  Apollo                         0xab70 closure #1 in ApolloStore.publish(records:identifier:callbackQueue:completion:) + 79 (ApolloStore.swift:79)
8  Apollo                         0xa4bc thunk for @escaping @callee_guaranteed () -> () + 20 (<compiler-generated>:20)
9  libdispatch.dylib              0x1e6c _dispatch_call_block_and_release + 32
10 libdispatch.dylib              0x3a30 _dispatch_client_callout + 20
11 libdispatch.dylib              0x140a0 _dispatch_lane_concurrent_drain + 992
12 libdispatch.dylib              0xbcf0 _dispatch_lane_invoke + 504
13 libdispatch.dylib              0x6a24 _dispatch_queue_override_invoke + 496
14 libdispatch.dylib              0x15164 _dispatch_root_queue_drain + 396
15 libdispatch.dylib              0x1596c _dispatch_worker_thread2 + 164
16 libsystem_pthread.dylib        0x1080 _pthread_wqthread + 228
17 libsystem_pthread.dylib        0xe5c start_wqthread + 8

Versions

  • apollo-ios SDK version: 0.49.1
  • Xcode version: 13.4.1
  • Swift version: 5.6.1
  • Package manager: n/a

Steps to reproduce

Reported hundreds of times in production; we can't reproduce it with the debugger attached, even with the same data.

Further details

The crash summary available in Xcode Reports (part of the "Organizer" view) pointed to this particular line (screenshot: Screen Shot 2022-09-26 at 12 07 55).

We've found a possibly related issue on SO: https://stackoverflow.com/a/55582763, where the author indicates that accessing a static constant inside a loop was the cause. In our case, JSONValueMatcher.equals(lhs:rhs:) is called from inside two nested for loops.
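For illustration, one reading of that suggestion is to bind the static member to a local once, before the loop, so the loop body no longer references it on each iteration. The sketch below is an assumption about the shape of the change, not Apollo's code: `ValueMatcher` and this `merge(_:into:)` are hypothetical stand-ins, and whether such a change avoids the crash has not been verified.

```swift
// Hypothetical stand-ins for illustration; this is not Apollo's source code.
enum ValueMatcher {
    static func equals(lhs: AnyHashable, rhs: AnyHashable) -> Bool {
        lhs == rhs
    }
}

/// Merges `incoming` into `existing` and returns the keys whose values changed.
func merge(_ incoming: [String: AnyHashable],
           into existing: inout [String: AnyHashable]) -> Set<String> {
    // Bind the static member to a local once, outside the loop,
    // rather than referencing it on every iteration.
    let isEqual = ValueMatcher.equals(lhs:rhs:)
    var changedKeys = Set<String>()
    for (key, newValue) in incoming {
        if let oldValue = existing[key], isEqual(oldValue, newValue) {
            continue // value unchanged, nothing to write
        }
        existing[key] = newValue
        changedKeys.insert(key)
    }
    return changedKeys
}

// Example call with made-up data.
var cache: [String: AnyHashable] = ["name": "Luke", "height": 172]
let changed = merge(["name": "Luke", "height": 175, "mass": 77], into: &cache)
```

The behavior is the same as calling the static method directly inside the loop; only the point at which the static member is looked up moves.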

See also

Possibly related issues:

  • https://github.com/apollographql/apollo-ios/issues/288
  • https://github.com/apollographql/apollo-ios/issues/1681

rafalkitta avatar Sep 27 '22 13:09 rafalkitta

Hi @rafalkitta 👋🏻 - I've taken a look through the linked issues; nothing obvious jumps out as a possible cause, and those issues don't seem to have any solid leads either.

Reported hundreds of times in production; we can't reproduce it with the debugger attached, even with the same data.

Are you able to run a release build and feed in the same data? Possibly the debug build is shielding/hiding the error.

calvincestari avatar Sep 27 '22 17:09 calvincestari

@calvincestari I just tried a release build with the same data, no success 🙁 This crash might not be data-specific. What do you think about the solution mentioned on SO? It looks like a harmless change, and if this is a compiler optimization issue it should help.

rafalkitta avatar Sep 28 '22 07:09 rafalkitta

Lines 3 and 4 of the crash stack trace show that the crash came from Apollo, but from <compiler-generated> code, so yes, it's possible that some kind of compiler optimization is failing.

We're in the final rush towards GraphQL Summit and the GA release of v1.0, so realistically we won't be able to dig much into this until the end of next week or the week after. Is this something you would have the time to test yourself? You could fork the repo, code the suggested change, trial it in TestFlight/production, and then contribute the change back if it resolves the issue.

calvincestari avatar Sep 28 '22 17:09 calvincestari

Unfortunately, I don't have the time for now 😞 I can wait, as the feature affected by this problem is not turned on in production all the time. Waiting for some good news from you soon 🤞

rafalkitta avatar Oct 03 '22 10:10 rafalkitta

Hi @rafalkitta 👋🏻 please try upgrading to 1.0.7 - a lot has changed since 0.49.1. Thanks!

bignimbus avatar Feb 10 '23 19:02 bignimbus