
[BUG] `mx.linalg.inv` causes a core dump when given a non-invertible (singular) matrix.

Open Redempt1onzzZZ opened this issue 8 months ago • 3 comments

Describe the bug: Calling mx.linalg.inv on a non-invertible (singular) matrix crashes the process with a core dump.

To Reproduce

Include code snippet

import mlx.core as mx
import numpy as np

a = mx.array([[4,1],[4,1]],dtype=mx.float32)
print(a)
b = mx.linalg.inv(a)
print(b)

Expected behavior: It should throw an error.

Desktop (please complete the following information):

  • macOS 15.1.1
  • Version 0.24.1

Redempt1onzzZZ avatar Apr 22 '25 08:04 Redempt1onzzZZ

This is a tricky one to fix, but it would be good to work towards being able to catch errors from another thread.

One option to consider is having an error flag on the CPU command encoder and setting that flag when an op fails, then having the main thread check the flag at certain points. But we also need to be careful to leave the eval in a valid state to make this work.
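To make the error-flag idea concrete, here is a minimal Python sketch of the pattern, not MLX's actual implementation. The `CommandEncoder` class and its `run` and `check` methods are hypothetical names chosen for illustration: the worker thread records the first failure instead of letting it escape, and the main thread re-raises it at a synchronization point.

```python
import threading


class CommandEncoder:
    """Hypothetical stand-in for a CPU command encoder (not an MLX class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._error = None  # first exception recorded by a worker, if any

    def run(self, task):
        # Runs on the worker thread: capture the failure instead of letting
        # the exception escape the thread.
        try:
            task()
        except Exception as exc:
            with self._lock:
                if self._error is None:
                    self._error = exc

    def check(self):
        # Runs on the main thread at a synchronization point: clear the flag
        # and re-raise the recorded error so the caller can catch it.
        with self._lock:
            err, self._error = self._error, None
        if err is not None:
            raise err


def failing_op():
    # Stand-in for an op whose factorization fails.
    raise RuntimeError("LU factorization failed with error code 2")


encoder = CommandEncoder()
worker = threading.Thread(target=encoder.run, args=(failing_op,))
worker.start()
worker.join()

try:
    encoder.check()
    caught = None
except RuntimeError as exc:
    caught = str(exc)

print(caught)
```

The sketch clears the flag once it has been reported, so a later `check()` does not raise the same error again. It does not address the harder part mentioned above, which is leaving the eval in a valid state after a failed op.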

awni avatar Apr 22 '25 13:04 awni

I noticed that in /mlx/mlx/backend/cpu/inverse.cpp, the function general_inv currently only checks whether info != 0. Maybe adding a separate check for info > 0 would solve this problem.

My understanding of the MLX source code is still limited, so this might be a wrong idea. I tried modifying the source to experiment, but I ran into a build issue and am still trying to solve it.

Redempt1onzzZZ avatar Apr 23 '25 00:04 Redempt1onzzZZ

It's weird, I'm not getting the core dump error but this instead, which I think makes it pretty explicit that it is failing to invert a non-invertible matrix:

libc++abi: terminating due to uncaught exception of type std::runtime_error: [Inverse::eval_cpu] LU factorization failed with error code 2
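For context on the error code: assuming the inverse is built on a LAPACK-style LU routine such as getrf (an assumption based on the message text, not confirmed from the MLX source), a positive status `info = k` means the k-th pivot of U is exactly zero, while a negative value indicates an illegal argument. The pure-Python sketch below imitates only the positive case; `lu_info` is a hypothetical helper, not an MLX or LAPACK function.

```python
def lu_info(matrix):
    """Return a LAPACK-style status for LU with partial pivoting.

    0 means every pivot was nonzero; k > 0 means the k-th pivot (1-based)
    is exactly zero, i.e. the matrix is singular.
    """
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    info = 0
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            # Record only the first zero pivot, then keep going.
            if info == 0:
                info = k + 1
            continue
        a[k], a[p] = a[p], a[k]
        # Eliminate the entries below the pivot.
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return info


print(lu_info([[4, 1], [4, 1]]))  # 2: the second pivot is zero
print(lu_info([[4, 1], [2, 3]]))  # 0: the matrix is invertible
```

For the matrix in the report, eliminating the first column turns the second row into all zeros, so the second pivot is zero and the status is 2, which matches the code in the message above.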

mihirneal avatar Apr 23 '25 22:04 mihirneal