David Koski

259 comments by David Koski

Yes, I agree that making separate entry points if you want finer control makes sense:
- download
- load
- download + load (convenience)

Do you want to make a...
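For illustration, the three entry points listed above could be shaped roughly like this. It is only a sketch: `ModelFiles`, `LoadedModel`, `ModelProvider`, and `StubProvider` are hypothetical names, not types from mlx-swift-examples, and the functions are synchronous to keep the example short (a real version would likely be `async`).

```swift
import Foundation

// Hypothetical types for illustration; not the mlx-swift-examples API.
struct ModelFiles {
    let directory: URL
}

struct LoadedModel {
    let name: String
}

protocol ModelProvider {
    /// Fetch the files only, so callers can cache them or report progress.
    func download(id: String) throws -> ModelFiles
    /// Build a model from files that are already on disk.
    func load(_ files: ModelFiles) throws -> LoadedModel
}

extension ModelProvider {
    /// Convenience entry point: download, then load.
    func downloadAndLoad(id: String) throws -> LoadedModel {
        let files = try download(id: id)
        return try load(files)
    }
}

/// Stub that touches neither the network nor the disk.
struct StubProvider: ModelProvider {
    func download(id: String) throws -> ModelFiles {
        ModelFiles(directory: URL(fileURLWithPath: "/tmp/models/\(id)"))
    }

    func load(_ files: ModelFiles) throws -> LoadedModel {
        LoadedModel(name: files.directory.lastPathComponent)
    }
}
```

Callers who want finer control call `download` and `load` separately; everyone else uses `downloadAndLoad`.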

First the question of the timeout: you would need to see what it was doing at the time. It is possible that it took more physical memory than you had...

I think you want something like this: https://github.com/ml-explore/mlx-swift-examples/blob/main/Libraries/MLXLMCommon/Evaluate.swift#L12 You can supply your own logit sampler (returning a single value MLXArray). If that isn't quite what you are looking for, then...

OK, the logit sampler ought to be able to do that -- it has all the logits and its only task is to return a single token.
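As a rough sketch of that idea, a sampler receives the full logit vector and returns one token id. The code below uses plain `[Float]` and a hypothetical `TokenSampler` protocol so that it runs without MLX; the real sampler in MLXLMCommon works on `MLXArray` and its signature may differ.

```swift
// Hypothetical protocol over plain arrays; not the MLXLMCommon API.
protocol TokenSampler {
    /// Sees every logit and returns the index of exactly one token.
    func sample(logits: [Float]) -> Int
}

/// Picks the highest-scoring token (first one wins on ties).
struct GreedySampler: TokenSampler {
    func sample(logits: [Float]) -> Int {
        precondition(!logits.isEmpty, "logits must not be empty")
        var best = 0
        for (index, value) in logits.enumerated() where value > logits[best] {
            best = index
        }
        return best
    }
}

/// Masks out every token id not in `allowed`, then picks greedily.
/// If `allowed` matches nothing, all logits are -infinity and index 0 is returned.
struct AllowListSampler: TokenSampler {
    let allowed: Set<Int>

    func sample(logits: [Float]) -> Int {
        let masked = logits.enumerated().map { pair in
            allowed.contains(pair.offset) ? pair.element : -Float.infinity
        }
        return GreedySampler().sample(logits: masked)
    }
}
```

Because the sampler has the whole distribution in hand, custom selection rules like the allow list can live entirely inside it.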

Closing this -- no additional feedback. Please file another issue if the logit sampler is not adequate to the task.

I presume this is a model (multilingual_e5_small) that requires an unsupported tokenizer? It looks like there is a Roberta post processing unit:
- https://github.com/huggingface/swift-transformers/blob/d42fdae473c49ea216671da8caae58e102d28709/Sources/Tokenizers/PostProcessor.swift#L86

If this tokenizer is similar to...

I don't think it belongs in `ModelContainer` -- that is scoped to the lifetime of the model (weights really). The KVCache is more like a session. Think of it in...

> I'm not much of an expert on Sendable so I'm not sure what tricks could be used to have the MLXArray stored outside of the ModelContainer.

I think you...

Yes, this is a limitation in what we have. There are two approaches. In the first, nothing (or almost nothing) throws: if you use them wrong, it is a programmer error and...

Overall I like this direction. I think it needs:
- finish the borrowing of the KVCache (implement a visitor that holds a lock so we can satisfy the `@unchecked Sendable`)...
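One way to picture the lock-holding visitor mentioned in the first item is the sketch below. `CacheState` and `LockedCache` are hypothetical stand-ins, not the MLXLMCommon types; the point is only that every access goes through one closure-based entry point that holds the lock.

```swift
import Foundation

// Hypothetical stand-in for the cache contents; a real KV cache would hold
// arrays that are not Sendable by themselves.
struct CacheState {
    var tokens: [Int] = []
}

/// The compiler does not check `@unchecked Sendable`; the claim is justified
/// here only because `state` is private and reachable solely through `visit`.
final class LockedCache: @unchecked Sendable {
    private let lock = NSLock()
    private var state = CacheState()

    /// Lends the state to `body` for the duration of the call while holding the lock.
    func visit<R>(_ body: (inout CacheState) throws -> R) rethrows -> R {
        lock.lock()
        defer { lock.unlock() }
        return try body(&state)
    }
}

// Usage: mutate and read only inside the visitor closure.
let cache = LockedCache()
cache.visit { $0.tokens.append(7) }
let count = cache.visit { $0.tokens.count }
```

The closure is non-escaping, so the borrowed state cannot leak past the locked region unless the caller copies it out on purpose.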