PromiseKit icon indicating copy to clipboard operation
PromiseKit copied to clipboard

EmptyBox.seal crash

Open stachon opened this issue 5 years ago • 21 comments

We see this crash in our app on AppStore, using PromiseKit 6.2.1:

Thread 5 Crashed:
0   libsystem_malloc.dylib        	0x000000018763355c nanov2_allocate_from_block$VARIANT$mp + 4 (nanov2_malloc.c:1942)
1   libsystem_malloc.dylib        	0x00000001876329d0 nanov2_allocate$VARIANT$mp + 140 (nanov2_malloc.c:2395)
2   libsystem_malloc.dylib        	0x00000001876328f4 nanov2_malloc$VARIANT$mp + 60 (nanov2_malloc.c:922)
3   libsystem_malloc.dylib        	0x00000001876419a0 malloc_zone_malloc + 156 (malloc.c:1454)
4   libsystem_malloc.dylib        	0x00000001876423b0 malloc + 32 (malloc.c:1720)
5   libswiftCore.dylib            	0x0000000104931f80 0x104684000 + 2809728
6   libswiftCore.dylib            	0x0000000104932000 0x104684000 + 2809856
7   libswiftDispatch.dylib        	0x0000000104b9bcac 0x104b88000 + 81068
8   libswiftDispatch.dylib        	0x0000000104b973e0 0x104b88000 + 62432
9   libswiftDispatch.dylib        	0x0000000104b97d74 0x104b88000 + 64884
10  PromiseKit                    	0x00000001031daf14 EmptyBox.seal(_:) + 188 (Box.swift:38)
11  PromiseKit                    	0x00000001031dc9c0 partial apply + 24 (<compiler-generated>:0)
12  PromiseKit                    	0x00000001031e1c18 thunk for @escaping @callee_guaranteed (@in_guaranteed A) -> () + 20 (<compiler-generated>:0)
13  PromiseKit                    	0x00000001031dc42c partial apply for thunk for @escaping @callee_guaranteed (@in_guaranteed A) -> (@out ()) + 20 (<compiler-generated>:0)
14  PromiseKit                    	0x00000001031dc348 partial apply for closure #2 in EmptyBox.seal(_:) + 28 (Box.swift:51)
15  PromiseKit                    	0x00000001031dc3cc partial apply for thunk for @callee_guaranteed (@guaranteed @escaping @callee_guaranteed (@in_gua... + 104 (<compiler-generated>:0)
16  libswiftCore.dylib            	0x000000010468fc14 0x104684000 + 48148
17  PromiseKit                    	0x00000001031dafdc EmptyBox.seal(_:) + 388 (Box.swift:51)
18  PromiseKit                    	0x00000001031dc9c0 partial apply + 24 (<compiler-generated>:0)
19  PromiseKit                    	0x00000001031e1c18 thunk for @escaping @callee_guaranteed (@in_guaranteed A) -> () + 20 (<compiler-generated>:0)
20  PromiseKit                    	0x00000001031dc42c partial apply for thunk for @escaping @callee_guaranteed (@in_guaranteed A) -> (@out ()) + 20 (<compiler-generated>:0)
21  PromiseKit                    	0x00000001031dc348 partial apply for closure #2 in EmptyBox.seal(_:) + 28 (Box.swift:51)
22  PromiseKit                    	0x00000001031dc3cc partial apply for thunk for @callee_guaranteed (@guaranteed @escaping @callee_guaranteed (@in_gua... + 104 (<compiler-generated>:0)
23  libswiftCore.dylib            	0x000000010468fc14 0x104684000 + 48148
24  PromiseKit                    	0x00000001031dafdc EmptyBox.seal(_:) + 388 (Box.swift:51)
25  PromiseKit                    	0x00000001031dc9c0 partial apply + 24 (<compiler-generated>:0)
26  PromiseKit                    	0x00000001031e1c18 thunk for @escaping @callee_guaranteed (@in_guaranteed A) -> () + 20 (<compiler-generated>:0)
27  PromiseKit                    	0x00000001031dc42c partial apply for thunk for @escaping @callee_guaranteed (@in_guaranteed A) -> (@out ()) + 20 (<compiler-generated>:0)
28  PromiseKit                    	0x00000001031dc348 partial apply for closure #2 in EmptyBox.seal(_:) + 28 (Box.swift:51)
29  PromiseKit                    	0x00000001031dc3cc partial apply for thunk for @callee_guaranteed (@guaranteed @escaping @callee_guaranteed (@in_gua... + 104 (<compiler-generated>:0)
30  libswiftCore.dylib            	0x000000010468fc14 0x104684000 + 48148

EmptyBox.seal is repeated many times in the stack trace, so it is probably an infinite recursion. Unfortunately the stack tace in the crash log is not deep enough to include the original source. It might be an incorrect usahe of PromiseKit on our side, is there some pattern to look for? Or is it a PromiseKit bug that might have been already solved in a newer version? Thanks.

stachon avatar Apr 10 '19 08:04 stachon

Not seen this before. seal does not call itself and is constructed so it should not call any handlers (eg. then handlers) until the “seal” is finalized, and subsequence seal calls are no-ops. So I have no immediate ideas.

mxcl avatar Apr 10 '19 21:04 mxcl

I'm having a similar issue, here's how the cycle begins in my case:

Catchable.swift:

                on.async(flags: flags) {
                    body(error) // << line 33
                    finalizer.pending.resolve(())
                }

MyClass.swift:

            return Promise { seal in
                firstly {
                    Movie.make(path: path) // this returns a `Promise<Movie?>`
                }.done { movie in
                    // some code that uses movie, but doesn't use `seal`
                    seal.fulfill(())
                }.catch { error in
                    seal.reject(error) // << this line is causing the crash
                }
            }

Resolver.swift:

    /// Rejects the promise with the provided error
    func reject(_ error: Error) {
        box.seal(.rejected(error)) // << line 24
    }

Box.swift:

        if let handlers = handlers {
            handlers.bodies.forEach{ $0(value) } // << line 51
        }

Then, the trace shows Sequence.forEach(_:) with assembler code. Next, the same as above and this is actually where the infinite recursion starts:

[BEGIN OF RECURSION CODE]

Box.swift:

        if let handlers = handlers {
            handlers.bodies.forEach{ $0(value) } // << line 51
        }

Then, again the trace shows Sequence.forEach(_:) with assembler code.

Box.swift:

        if let handlers = handlers {
            handlers.bodies.forEach{ $0(value) } // << line 51
        }

Thenable.swift:

            case .rejected(let error):
                rp.box.seal(.rejected(error)) // << line 48
            }

[GO TO BEGIN OF RECURSION CODE]

Maybe this info helps giving advice on how to fix it or fix the issue? I'm running version 6.8.4 on tvOS 12.2 within Xcode 10.2 (Swift 5).

Jeehut avatar Apr 21 '19 10:04 Jeehut

Note that I could fix the issue, by setting multiple breakpoints in my Movie.make(path: path) method and investigated what caused the error in the first place. Then, once I fixed it and the seal.reject(error) in the MyClass.swift wasn't called anymore (instead seal.fulfill(()) being called), the issue didn't appear any more. But that of course isn't a fix to this issue, it's merely preventing my code from failing in the first place. But once it fails, I'll have a crash again without possibilities of handling the error within the app properly.

Jeehut avatar Apr 21 '19 10:04 Jeehut

The code is here for this area:

https://github.com/mxcl/PromiseKit/blob/master/Sources/Box.swift#L36-L57

Hard to see how it could recurse, the barrier synchronization prevents re-entry and sets the state of the Box to sealed so even if the handler caused itself to add another handler it would do the no-op at the top and return. And the barrier would prevent it from recursing anyway.

/cc @GarthSnyder @dougzilla32

mxcl avatar Apr 21 '19 13:04 mxcl

I looked through this too; I don't see how it's possible either. It seems pretty clear that EmptyBox.sealant isn't written outside a barrier and that handlers are never executed until the sealant has been set to .resolved.

@Dschee, this is a very nicely documented exploration of the loop, but I'm guessing it would require a live test case to really understand. This is a lot to ask, but is there any chance you could upload a minified example that reproduces the problem?

One thing I notice is that Thenable.swift:48 is part of the implementation of then, and I don't see any thens in the code outline above.

I made a little test case project that adds a then to the Movie dummy, but it doesn't reproduce the issue on its own.

TestCaseTV.zip

GarthSnyder avatar Apr 22 '19 05:04 GarthSnyder

I think it might not be related to an infinite loop, but perhaps it's related to a long chain of handlers when the error occurs? In the attached example, you can change guard page < 1 else { in ViewController.swift:extractItemsInPage(page:context:):L111 to guard page < 4000 else { and you will see the crash disappear:

PromiseKitCrash.zip

katzenbaer avatar Sep 18 '19 00:09 katzenbaer

I'm seeing the same crash on an app currently on beta on Test Flight. I was also able to reproduce the crash on @katzenbaer's example.

While I've only received one crash report of this crash, I'm worried it might be a more widespread issue once we get more users on board.

savvyEmilio avatar Jan 27 '20 20:01 savvyEmilio

I'm seeing this crash when setting up a synchronous data chunk reading pattern similar to the Chaining Sequences example in the Common Patterns doc. Every time I try to cancel, I run into an "EXC_BAD_ACCESS" crash on EmptyBox.seal.

Here's the setup:

func execute(files: [Data]) -> Promise<Void> {
    var promise = Promise()
    for file in files {
        promise = promise.then {
            self.writeFile(file)
        }
    }

    return promise.then {
        ...something else...
    }
}

func writeFile(_ file: Data) -> Promise<Void> {
    let size = file.count
    var promise = Promise()
    var startIndex = 0

    while startIndex < size {
        let endIndex = min(startIndex + BUFFER_SIZE, size)
        let data = file.subdata(in: startIndex..<endIndex)
        startIndex = endIndex

        promise = promise.then { () -> Promise<Void> in
            if self.abort {   // Bool that gets set to true if user hits a cancel button
                return Promise(error: PMKError.cancelled)
            } else {
                return somethingElse.write(data)
            }
        }
    }

    return promise
}

cfchristian2 avatar Mar 03 '20 17:03 cfchristian2

Our cancel code is a bit of ObjC trickery that potentially is a bug in Swift.

Only way we can fix this. is with a testcase unfortunately, anyone vested enough to provide one will be rewarded by the fix.

mxcl avatar Mar 04 '20 14:03 mxcl

Here's a testcase, hope it helps :) Running on an iPhone XS Max iOS 13.4.1

import UIKit
import PromiseKit

class TestViewController: UIViewController {
    
    enum Example: Error {
        case Error
    }
    
    /*
     Common pattern used from the PromiseKit repo
     https://github.com/mxcl/PromiseKit/blob/master/Documentation/CommonPatterns.md#retry--polling
    */
    
    func attempt<T>(maximumRetryCount: Int = 3, delayBeforeRetry: DispatchTimeInterval = .seconds(2), _ body: @escaping () -> Promise<T>) -> Promise<T> {
        var attempts = 0
        func attempt() -> Promise<T> {
            attempts += 1
            print(attempts)
            return body().recover { error -> Promise<T> in
                guard attempts < maximumRetryCount else { print("Stopped"); throw error }
                return after(delayBeforeRetry).then(on: nil, attempt)
            }
        }
        return attempt()
    }
    
    private func exampleFunction() -> Promise<Void> {
        return Promise { seal in
            throw Example.Error
        }
    }
    
    override func viewDidLoad() {
        super.viewDidLoad()
        
        attempt(maximumRetryCount: 500, delayBeforeRetry: .milliseconds(1)) {
            self.exampleFunction()
        }.done { _ in
            print("DONE")
        }.catch { error in
            print("Error", error)
        }
    }
}

elapovski avatar Apr 29 '20 12:04 elapovski

I can confirm that @elapovski's test case raises the same issue as mentioned above. While 500 being enough on an actual phone, I have to raise it a bit (tried it with 4000) on my computer.

Also +1 for the issue in general. We have a rare crash here as well.

flohei avatar May 07 '20 12:05 flohei

I found this bug in Swift, not sure if it could be related: https://bugs.swift.org/browse/SR-12294

stachon avatar May 07 '20 12:05 stachon

For reference, we are seeing a similar crash...

Crashed: NSOperationQueue 0x104b14560 (QOS: UNSPECIFIED)
0   libsystem_malloc.dylib         0x19cd4bb1c nanov2_allocate_from_block$VARIANT$mp + 4
1   libsystem_malloc.dylib         0x19cd4aed0 nanov2_malloc$VARIANT$mp + 60
2   libsystem_malloc.dylib         0x19cd59754 malloc_zone_malloc + 156
3   libsystem_malloc.dylib         0x19cd5a164 malloc + 28
4   libswiftCore.dylib             0x1aa5aa138 swift_slowAlloc + 32
5   libswiftCore.dylib             0x1aa5aa1a8 _swift_allocObject_(swift::TargetHeapMetadata<swift::InProcess> const*, unsigned long, unsigned long) + 28
6   libswiftDispatch.dylib         0x1d171e878 partial apply for OS_dispatch_queue._syncBarrier(block:) + 52
7   libswiftDispatch.dylib         0x1d17188b4 OS_dispatch_queue._syncHelper<A>(fn:execute:rescue:) + 228
8   libswiftDispatch.dylib         0x1d17191e0 OS_dispatch_queue.sync<A>(flags:execute:) + 432
9   PromiseKit                     0x102812c8c $s10PromiseKit8EmptyBoxC4sealyyxF + 180
10  PromiseKit                     0x102830004 $s10PromiseKit8ThenablePAAE4then2on5flags_AA0A0Cy1TQyd__GSo17OS_dispatch_queueCSg_8Dispatch0J13WorkItemFlagsVSgqd__AIQzKctAaBRd__lFyAA6ResultOyASGcfU_ + 468
11  PromiseKit                     0x10283624c $s10PromiseKit8ThenablePAAE4then2on5flags_AA0A0Cy1TQyd__GSo17OS_dispatch_queueCSg_8Dispatch0J13WorkItemFlagsVSgqd__AIQzKctAaBRd__lFyAA6ResultOyASGcfU_TA + 40
12  PromiseKit                     0x10281c634 $sxIegn_xytIegnr_lTR + 20
13  PromiseKit                     0x102814368 $sxytIegnr_xIegn_lTRTA + 20
14  PromiseKit                     0x102814300 $s10PromiseKit8EmptyBoxC4sealyyxFyyxcXEfU0_TA + 48
15  PromiseKit                     0x102812fc8 $sxIegn_s5Error_pIggzo_xytIegnr_sAA_pIegnzo_lTR + 108
16  PromiseKit                     0x102814328 $sxIegn_s5Error_pIggzo_xytIegnr_sAA_pIegnzo_lTRTA + 20
17  libswiftCore.dylib             0x1aa454418 Sequence.forEach(_:) + 448
18  PromiseKit                     0x102812d58 $s10PromiseKit8EmptyBoxC4sealyyxF + 384
19  PromiseKit                     0x102830004 $s10PromiseKit8ThenablePAAE4then2on5flags_AA0A0Cy1TQyd__GSo17OS_dispatch_queueCSg_8Dispatch0J13WorkItemFlagsVSgqd__AIQzKctAaBRd__lFyAA6ResultOyASGcfU_ + 468
20  PromiseKit                     0x10283624c $s10PromiseKit8ThenablePAAE4then2on5flags_AA0A0Cy1TQyd__GSo17OS_dispatch_queueCSg_8Dispatch0J13WorkItemFlagsVSgqd__AIQzKctAaBRd__lFyAA6ResultOyASGcfU_TA + 40
21  PromiseKit                     0x10281c634 $sxIegn_xytIegnr_lTR + 20
22  PromiseKit                     0x102814368 $sxytIegnr_xIegn_lTRTA + 20
23  PromiseKit                     0x102814300 $s10PromiseKit8EmptyBoxC4sealyyxFyyxcXEfU0_TA + 48
24  PromiseKit                     0x102812fc8 $sxIegn_s5Error_pIggzo_xytIegnr_sAA_pIegnzo_lTR + 108
25  PromiseKit                     0x102814328 $sxIegn_s5Error_pIggzo_xytIegnr_sAA_pIegnzo_lTRTA + 20
26  libswiftCore.dylib             0x1aa454418 Sequence.forEach(_:) + 448
27  PromiseKit                     0x102812d58 $s10PromiseKit8EmptyBoxC4sealyyxF + 384
28  PromiseKit                     0x102830004 $s10PromiseKit8ThenablePAAE4then2on5flags_AA0A0Cy1TQyd__GSo17OS_dispatch_queueCSg_8Dispatch0J13WorkItemFlagsVSgqd__AIQzKctAaBRd__lFyAA6ResultOyASGcfU_ + 468
29  PromiseKit                     0x10283624c $s10PromiseKit8ThenablePAAE4then2on5flags_AA0A0Cy1TQyd__GSo17OS_dispatch_queueCSg_8Dispatch0J13WorkItemFlagsVSgqd__AIQzKctAaBRd__lFyAA6ResultOyASGcfU_TA + 40
30  PromiseKit                     0x10281c634 $sxIegn_xytIegnr_lTR + 20

Once again, it seems EmptyBox.seal is repeated many times. I understand this may not be a problem with PromiseKit necessarily, but any help in pinpointing the issue would be most appreciated.

MatthewWakefield-PCC avatar May 12 '20 15:05 MatthewWakefield-PCC

We are facing the same crash on Xcode 12 beta 5 and it's occurring more often.

image

mohammad19991 avatar Aug 26 '20 08:08 mohammad19991

I just re-examined the code in question and genuinely cannot see a valid concurrency issue that would indicate this is PromiseKit’s fault. The data structures are all manipulated atomically, even for foreseeable “then called while promise is being sealed scenarios”.

Edit: OTOH the consistency of reports here either means Swift has been miscompiling this code for a while, or it’s us IMO.

mxcl avatar Aug 31 '20 13:08 mxcl

At this point one of you people needs to step up, report the bug, give the Swift team a test case and see what they say, there is no point in waiting on me, I have too much to do I am afraid and haven’t seen this bug personally.

mxcl avatar Aug 31 '20 13:08 mxcl

At this point one of you people needs to step up, report the bug, give the Swift team a test case and see what they say, there is no point in waiting on me, I have too much to do I am afraid and haven’t seen this bug personally.

I can open a ticket w Swift team, but are you able to reproduce in the example I uploaded? I think others have vouched that my example project reproduces for them.

katzenbaer avatar Aug 31 '20 13:08 katzenbaer

@stachon @mxcl Hi, guys! Is there any update on this issue? Have you found any fixes?

Blyabtroi avatar Aug 13 '21 07:08 Blyabtroi

@Blyabtroi I am not using PromiseKit anymore so I can't add anything to this issue.

stachon avatar Aug 13 '21 07:08 stachon

k, I figured it out, it’s a stackoverflow.

Here’s an example of it happening with Swift NIO: https://github.com/apple/swift-nio/issues/970

It only happens for situations where you are chaining promises > ~4,000 times (depending on the function stack size of the runtime, so it varies for Mac vs. iOS).

A way to mitigate would be to stick a DispatchQueue.main.async{ } in the handler that recurses.

Making it so PromiseKit itself doesn't do this would require some kind of counter where we then do this manually ourselves to break the stack every 500 or so recursions. I’m not sure this is possible since Apple don’t let us know what GCD queue we are on with public API so I’m not sure we can “count”.

This may well be a caveat to PromiseKit and indeed futures/promises in general with Swift.

I used this code to trigger it (add to ThenableTests.swift):

extension ThenableTests {
    func testReproduceStackoverflow() {

        let ex = expectation(description: "")

        func attempt<T>(maximumRetryCount: Int = 3, delayBeforeRetry: DispatchTimeInterval = .seconds(2), _ body: @escaping () -> Promise<T>) -> Promise<T> {
            var attempts = 0
            func attempt() -> Promise<T> {
                attempts += 1
                return body().recover { error -> Promise<T> in
                    guard attempts < maximumRetryCount else { throw error }
                    return after(delayBeforeRetry).then(on: .main, attempt)
                }
            }
            return attempt()
        }


        enum Error: Swift.Error {
            case dummy
        }

        func exampleFunction() -> Promise<Void> {
            return Promise { seal in
                throw Error.dummy
            }
        }

        attempt(maximumRetryCount: 6_000, delayBeforeRetry: .milliseconds(1)) {
            exampleFunction()
        }.catch { error in
            XCTFail("\(error)")
        }.finally(ex.fulfill)

        waitForExpectations(timeout: 600)
    }
}

The true fix is async/await since Apple hand coded the runtime and it doesn’t recurse, it loops the event loops very efficiently.

mxcl avatar Aug 16 '21 14:08 mxcl

So basically: not a bug (we're running up against the limitations of the platforms we run upon).

But we should try to not trigger it if we can figure out how.

Or at the very least document this behavior.

The final solution being: dispatch to some queue (our own?) everywhere this can happen (when we pipe straight to seal basically). Maybe that is ok, it's low performance, especially in cases like this where you have 4,000 nested closures, but PromiseKit doesn’t claim to chase high performance.

mxcl avatar Aug 16 '21 14:08 mxcl