mlx-swift-examples
mlx-swift-examples copied to clipboard
llm-tool: crash during async generation
Steps to reproduce:
- Move LLMTool.swift
@main
annotation fromSyncGenerator
toAsyncGenerator
. - Run llm-tool target.
Result: Thread 4: EXC_BAD_ACCESS (code=2, address=0x16ff1bfe0)
Call stack:
#0 0x00000001050d4ac0 in mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&)::$_3::operator()(mlx::core::array const&, bool) const at /Users/zach/Library/Developer/Xcode/DerivedData/mlx-swift-examples-ffvauqjflgpefffxjxtewhqyjymm/SourcePackages/checkouts/mlx-swift/Source/Cmlx/mlx/mlx/transforms.cpp:55
#1 0x00000001050d4ab0 in decltype(std::declval<mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&)::$_3&>()(std::declval<mlx::core::array const&>(), std::declval<bool>())) std::__1::__invoke[abi:v160006]<mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&)::$_3&, mlx::core::array const&, bool>(mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&)::$_3&, mlx::core::array const&, bool&&) at /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__functional/invoke.h:394
#2 0x00000001050d4a50 in void std::__1::__invoke_void_return_wrapper<void, true>::__call<mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&)::$_3&, mlx::core::array const&, bool>(mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&)::$_3&, mlx::core::array const&, bool&&) at /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__functional/invoke.h:487
(... snip 5000 frames of mlx::core::eval
recursion...)
#5320 0x00000001050c8d9c in mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&) at /Users/zach/Library/Developer/Xcode/DerivedData/mlx-swift-examples-ffvauqjflgpefffxjxtewhqyjymm/SourcePackages/checkouts/mlx-swift/Source/Cmlx/mlx/mlx/transforms.cpp:119
#5321 0x0000000104537828 in ::mlx_eval(mlx_vector_array) at /Users/zach/Library/Developer/Xcode/DerivedData/mlx-swift-examples-ffvauqjflgpefffxjxtewhqyjymm/SourcePackages/checkouts/mlx-swift/Source/Cmlx/include/mlx/c/transforms.cpp:20
#5322 0x0000000105199c80 in eval(_:) at /Users/zach/Library/Developer/Xcode/DerivedData/mlx-swift-examples-ffvauqjflgpefffxjxtewhqyjymm/SourcePackages/checkouts/mlx-swift/Source/MLX/Transforms+Eval.swift:12
#5323 0x0000000104494928 in closure #1 in generate(prompt:model:temp:) at /Users/zach/Documents/Development/GitHub/mlx/mlx-swift-examples/Libraries/LLM/Util.swift:103
#5324 0x0000000104494f14 in partial apply for closure #1 in generate(prompt:model:temp:) ()
#5325 0x0000000104495910 in thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out τ_0_0) ()
#5326 0x0000000104495a6c in partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out τ_0_0) ()
This gets really deep call stacks when compiled Debug. These are mostly tail calls and if you compile Release this will work. I will document this.
Thanks for the quick reply! I wondered if that was what was happening, but after trying a couple of things, I got the impression the stack size can't really be controlled with Swift concurrency (unless/until more executor stuff lands, maybe?). It wasn't immediately clear to me how to e.g. link an optimized build of the dependency into an otherwise debug build of my app, either. I don't have a ton of Swift experience, so grateful for any suggestions!
Yeah, I don't know that the stack size can be controlled either. Another option (maybe) is to put @MainActor
on it. This would work fine for a command line tool like this, but wouldn't work for an application, e.g. if you wanted to run this on iOS. This might work for you, depending on what you are doing.
As for how to do that when you have a Debug build, one thing that I tried much earlier was to add -O3
in mlx-swift
's Package.swift
.
For example around line 87:
cxxSettings: [
.headerSearchPath("mlx"),
.headerSearchPath("include/mlx-c"),
You can add something like .unsafeOptions(["-O3"]),
. This isn't suitable for putting in the Package.swift
in the repo as it makes it unbuildable in most situations (limitations around the unsafe options) but I think it ought to work for development.
Makes sense. Yeah, unfortunately, I was trying to wrap it with SwiftUI. I'll poke around at it more this evening after work. Thanks again!
I added some notes in the README: 8870b0d3868a4188b5b160c374ecfb12fc0a18bc
Thanks. Sadly, I tried .unsafeFlags(["-O3"])
in both cSettings and cxxSettings for the Cmlx target without any luck.
I ended up kludging together a variation of generate()
that's a cheesy mishmash of NSThread + async; I'm not even sure this semaphore backpressure hack will actually work the way I intended, but FWIW:
public class LLMGeneratorThread : Thread {
public let buffer: AsyncBufferSequence<AsyncChannel<Int>>
let channel = AsyncChannel<Int>()
let prompt: MLXArray
let model: LLMModel
let sem = DispatchSemaphore(value: 0)
public init(prompt: MLXArray, model: LLMModel) {
self.prompt = prompt
self.model = model
self.buffer = channel.buffer(policy: .bounded(10))
super.init()
self.stackSize = 8 * 1024 * 1024
}
public override func main() {
var y = prompt
var cache = [(MLXArray, MLXArray)]()
while !isCancelled {
var logits: MLXArray
(logits, cache) = model(
expandedDimensions(y, axis: 0), cache: cache.isEmpty ? nil : cache)
y = sample(logits: logits[-1, axis: 1], temp: 1.0)
eval(y)
let token = y.item(Int.self)
Task {
await channel.send(token)
sem.signal()
}
sem.wait()
}
}
}
Yeah, mixing the two like this is not ideal. I wonder if you could use this code for #if DEBUG
and have an #else
(Release) that was straight async code for Release builds? Also not ideal, but I think if you encapsulate then at least it is one particular spot to watch.
Yeah, I don't know that the stack size can be controlled either. Another option (maybe) is to put
@MainActor
on it. This would work fine for a command line tool like this, but wouldn't work for an application, e.g. if you wanted to run this on iOS. This might work for you, depending on what you are doing.As for how to do that when you have a Debug build, one thing that I tried much earlier was to add
-O3
inmlx-swift
'sPackage.swift
.For example around line 87:
cxxSettings: [ .headerSearchPath("mlx"), .headerSearchPath("include/mlx-c"),
You can add something like
.unsafeOptions(["-O3"]),
. This isn't suitable for putting in thePackage.swift
in the repo as it makes it unbuildable in most situations (limitations around the unsafe options) but I think it ought to work for development.
Maybe unsafeFlags
?
Maybe
unsafeFlags
?
Yeah, that's the ticket :-)
@davidkoski @zach-brockway I have tried all optimization flags, only -O3 does not work, -O, -O1, -O2 and -Os are all working.😂
The deep recursion issues are fixed as of https://github.com/ml-explore/mlx-swift/pull/67
Eval in an async context is safe. You can even run Debug builds!