mlx-swift

Fatal error: [metal::Device] Unable to load kernel gatherbfloat16int32_1_1_int

Open armin976 opened this issue 3 months ago • 2 comments

I'm trying to run an INT4 quantized model on an A12X GPU, but I'm getting this error when loading the model. Upon further inspection, it seems that chips up to the A14 don't support bf16. I'm wondering if there's a workaround that doesn't force me to use only the CPU.

armin976, Sep 07 '25 16:09

I'm seeing the same. Here's a stack trace on an iPad Mini 5, which lacks hardware bfloat16 support. From the Cmlx code in MLX, it appears some minimal attempt was made to limit using bfloat16 on unsupported devices, but the safety code is not exhaustive enough.

MLX/ErrorHandler.swift:332: Fatal error: [metal::Device] Unable to load kernel gatherbfloat16int32_1_1_int

Task 41 Queue : com.apple.root.user-initiated-qos.cooperative (concurrent)
#0	0x0000000186cf68c4 in _swift_runtime_on_report ()
#1	0x0000000186dcd958 in _swift_stdlib_reportFatalErrorInFile ()
#2	0x00000001869996e0 in closure #1 (Swift.UnsafeBufferPointer<Swift.UInt8>) -> () in Swift._assertionFailure(_: Swift.StaticString, _: Swift.String, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never ()
#3	0x00000001869988c4 in _assertionFailure ()
#4	0x0000000111f26e74 in closure #1 in ErrorHandler.dispatch(_:) at ...mlx-swift/Source/MLX/ErrorHandler.swift:332
#5	0x0000000111f27e88 in partial apply for closure #1 in ErrorHandler.dispatch(_:) ()
#6	0x0000000111f21a40 in NSLocking.withLock<NSLock>(_:) ()
#7	0x0000000111f264c8 in ErrorHandler.dispatch(_:) at ...mlx-swift/Source/MLX/ErrorHandler.swift:328
#8	0x0000000111f26188 in errorHandlerTrampoline(message:data:) at ...mlx-swift/Source/MLX/ErrorHandler.swift:283
#9	0x0000000111f2605c in @objc errorHandlerTrampoline(message:data:) ()
#10	0x00000001120561f0 in ::_mlx_error(const char *, const int, const char *, ...) at ...mlx-swift/Source/Cmlx/mlx-c/mlx/c/error.cpp:51
#11	0x0000000112002df0 in ::mlx_array_eval(mlx_array) at ...mlx-swift/Source/Cmlx/mlx-c/mlx/c/array.cpp:324
#12	0x0000000111f7e8a8 in closure #1 in MLXArray.eval() at ...mlx-swift/Source/MLX/MLXArray.swift:553
#13	0x0000000111f7ee34 in partial apply for closure #1 in MLXArray.eval() ()
#14	0x0000000111f21a40 in NSLocking.withLock<NSRecursiveLock>(_:) ()
#15	0x0000000111f7cf08 in MLXArray.eval() at ...mlx-swift/Source/MLX/MLXArray.swift:552
#16	0x0000000111f7ad4c in MLXArray.item<Int>(_:) at ...mlx-swift/Source/MLX/MLXArray.swift:331
#17	0x000000011319b664 in RepetitionContext.didSample(token:) at ...mlx-swift-examples/Libraries/MLXLMCommon/Evaluate.swift:235
#18	0x000000011319b868 in protocol witness for LogitProcessor.didSample(token:) in conformance RepetitionContext ()
#19	0x000000011319d0d0 in TokenIterator.convertToToken(logits:) at ...mlx-swift-examples/Libraries/MLXLMCommon/Evaluate.swift:401
#20	0x000000011319cd90 in TokenIterator.step(previous:) at ...mlx-swift-examples/Libraries/MLXLMCommon/Evaluate.swift:420
#21	0x000000011319c168 in TokenIterator.prepare(input:windowSize:) at ...mlx-swift-examples/Libraries/MLXLMCommon/Evaluate.swift:381
#22	0x000000011319c664 in TokenIterator.init(input:model:cache:parameters:) at ...mlx-swift-examples/Libraries/MLXLMCommon/Evaluate.swift:339
#23	0x000000011319ed00 in generate(input:parameters:context:didGenerate:) at ...mlx-swift-examples/Libraries/MLXLMCommon/Evaluate.swift:581

colintheshots, Oct 16 '25 20:10

In general, we don't build or test MLX on anything less than Metal 3. Those older devices do not support Metal 3 (see table), so I don't recommend using MLX on them; you may have to do some non-trivial debugging to get it to work.
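For anyone who wants to check at runtime which side of that line a device falls on, here is a minimal sketch using Metal's own `supportsFamily(_:)` query. The `MetalTier` enum and `metalTier()` function are hypothetical names made up for this example, not part of MLX, and treating an OS too old to know about `.metal3` as "below Metal 3" is an assumption of the sketch.

```swift
import Metal

// Hypothetical helper type for this example; not part of MLX or Metal.
enum MetalTier {
    case noDevice      // no Metal device available (e.g. some simulators)
    case belowMetal3   // device exists but does not report the Metal 3 family
    case metal3        // device reports the Metal 3 family
}

// Asks the default Metal device whether it supports the Metal 3 GPU family.
func metalTier() -> MetalTier {
    guard let device = MTLCreateSystemDefaultDevice() else {
        return .noDevice
    }
    // MTLGPUFamily.metal3 only exists on iOS 16 / macOS 13 / tvOS 16 and later.
    if #available(iOS 16.0, macOS 13.0, tvOS 16.0, *) {
        return device.supportsFamily(.metal3) ? .metal3 : .belowMetal3
    }
    // Assumption: an OS that predates the .metal3 case is treated as unsupported.
    return .belowMetal3
}
```

An app could call `metalTier()` before loading a model and fall back to the CPU, or warn the user, when the result is not `.metal3`.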

For a bit of color on bf16: older devices do not support it natively. For those devices we have a fallback emulated implementation. Your build should be using that, but it's possible it doesn't compile for some reason or another.

The gather kernels are JIT compiled, so if you dig around, hopefully you can find an error log explaining in more detail why the compilation failed. If you do find it and post it here, I'll be happy to help further.
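As a rough illustration of how Metal reports runtime compilation failures, the sketch below compiles a tiny shader from source with `makeLibrary(source:options:)` and returns the compiler's error text. It does not reproduce MLX's gather kernel or show where MLX logs its own errors; `runtimeCompileError` and `bf16ProbeSource` are hypothetical names for this example, and the probe assumes the `bfloat` type from newer Metal Shading Language versions, which an older device or OS may reject.

```swift
import Metal

// Hypothetical probe, not part of MLX: compiles MSL source at runtime and
// returns the compiler's error description, or nil if compilation succeeded.
func runtimeCompileError(for source: String, on device: MTLDevice) -> String? {
    do {
        _ = try device.makeLibrary(source: source, options: nil)
        return nil
    } catch {
        return String(describing: error)
    }
}

// A tiny kernel that uses the native bfloat type. Whether it compiles depends
// on the device and OS; no particular outcome is assumed here.
let bf16ProbeSource = """
#include <metal_stdlib>
using namespace metal;

kernel void bf16_probe(device float *out [[buffer(0)]],
                       uint i [[thread_position_in_grid]]) {
    bfloat x = bfloat(1.5f);
    out[i] = float(x);
}
"""

// Prints the outcome of the probe on the default device, if there is one.
func reportBF16Probe() {
    guard let device = MTLCreateSystemDefaultDevice() else {
        print("No Metal device available")
        return
    }
    if let message = runtimeCompileError(for: bf16ProbeSource, on: device) {
        print("Runtime compile failed: \(message)")
    } else {
        print("Native bfloat probe compiled")
    }
}
```

Calling `reportBF16Probe()` on the affected device shows whether its runtime compiler accepts native `bfloat` and, if not, what kind of message to look for in the logs.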

awni, Oct 17 '25 14:10