David Koski comments

Results: 272 comments by David Koski

Help with using LoRA adapter weights on a converted Qwen2.5 model in MLX

This means that there is a key `base_model` in the weights and there is no such key in the swift code. I am not sure what this key is for...

Help with using LoRA adapter weights on a converted Qwen2.5 model in MLX

`safetensors` is the native format for MLX, so that is fine. I wonder if PEFT produces the same structure of weights for the adapters? @awni do you know about this?...

How to handle application becoming inactive?

First the easy part: `Stream.gpu.synchronize()` is the call to wait for GPU activity to be done.

How to handle application becoming inactive?

For call `1`, it could potentially observe task cancellation (see #227), but one of the calls, `eval(model)`, can potentially take several seconds. Perhaps it could iterate over the parameters...
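
One reading of that idea, as a minimal sketch rather than anything in mlx-swift: evaluate the items one at a time and check for cancellation before each step, so a long evaluation can stop partway. `evaluateEach` and its `evaluate` closure are hypothetical stand-ins; in a real app the closure would presumably call `eval` on a single parameter array.

```swift
// Hypothetical helper, not part of mlx-swift.
// Runs `evaluate` on each item and checks for cancellation before every
// step. Returns how many items were evaluated before stopping.
func evaluateEach<T>(
    _ items: [T],
    isCancelled: () -> Bool = { Task.isCancelled },
    evaluate: (T) -> Void
) -> Int {
    var completed = 0
    for item in items {
        // Stop early if the surrounding task was cancelled.
        if isCancelled() { break }
        evaluate(item)
        completed += 1
    }
    return completed
}
```

The `isCancelled` parameter defaults to `Task.isCancelled`, so the loop observes cancellation of the enclosing Swift task; passing a custom closure makes the behavior easy to exercise without a real task.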

How to handle application becoming inactive?

Yes, I am not sure if the requester for #227 intended to submit a PR, but as-is it doesn't support cancellation. The notes in that issue should help if you...

How to handle application becoming inactive?

@MilanNosal I think what you did looks reasonable, though we would have to test it to make sure. `eval` on a single `MLXArray` will synchronously evaluate the graph that produces...

How to handle application becoming inactive?

I am not sure either -- that may be the time to first token cost. It requires evaluating the entire graph (the model) for a token. It may require JIT...

How to handle application becoming inactive?

It is possible via:

- https://swiftpackageindex.com/ml-explore/mlx-swift/main/documentation/mlx/seterrorhandler(_:data:dtor:)

You would probably set a global variable indicating that an error occurred and then check that. If the prompt is long you could potentially...
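
The "set a global variable and then check it" pattern could look roughly like the sketch below. Everything here is hypothetical (`ErrorFlag`, `MLXErrorState`, `recordError`) and none of it is mlx-swift API; the exact callback signature expected by `setErrorHandler(_:data:dtor:)` is not reproduced here, so `recordError` is only the body a registered handler might forward to.

```swift
import Foundation

// Hypothetical thread-safe holder for the first reported error message.
final class ErrorFlag: @unchecked Sendable {
    private let lock = NSLock()
    private var message: String?

    // Store the message unless an earlier error is still pending.
    func record(_ text: String) {
        lock.lock()
        defer { lock.unlock() }
        if message == nil { message = text }
    }

    // Return the pending message, if any, and clear the flag.
    func take() -> String? {
        lock.lock()
        defer { lock.unlock() }
        let pending = message
        message = nil
        return pending
    }
}

// Hypothetical global state that the generation loop can poll.
enum MLXErrorState {
    static let flag = ErrorFlag()
}

// Hypothetical function a registered error handler could call with the
// C string it receives.
func recordError(_ cMessage: UnsafePointer<CChar>?) {
    let text = cMessage.map { String(cString: $0) } ?? "unknown error"
    MLXErrorState.flag.record(text)
}
```

After each unit of work the caller would check `MLXErrorState.flag.take()` and stop if it returns a message.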

How to handle application becoming inactive?

That controls what size chunks it feeds the prompt in -- if using a smaller prefill size does it, then that is a great way, I think (and is the...
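
To illustrate what feeding the prompt in chunks means, here is a minimal sketch under stated assumptions: `prefillChunks` is a hypothetical helper, not the library's implementation, and the prompt is assumed to be a plain array of token ids. A smaller chunk size gives more boundaries at which an app could check for cancellation or a recorded error.

```swift
// Hypothetical helper, not part of mlx-swift.
// Splits prompt tokens into consecutive chunks of at most `size` tokens.
func prefillChunks(_ tokens: [Int], size: Int) -> [[Int]] {
    precondition(size > 0, "chunk size must be positive")
    return stride(from: 0, to: tokens.count, by: size).map { start in
        // The final chunk may be shorter than `size`.
        Array(tokens[start ..< min(start + size, tokens.count)])
    }
}
```

For example, five tokens with a chunk size of two yield three chunks, the last holding a single token.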

Update SDPA to use string "causal" or boolean mask

The latest tag of mlx-swift should support the boolean masks. We have another one coming soonish -- maybe I will wait and rev it to the latest if it happens...