
Examples using MLX Swift

128 mlx-swift-examples issues

As the [tech report](https://github.com/OpenBMB/MiniCPM/blob/main/report/MiniCPM_4_Technical_Report.pdf) mentions, it is really fast. An MLX version already exists: https://huggingface.co/mlx-community/MiniCPM4-8B-4bit

Nice release! However, I've run into a couple of small issues with Gemma3; any help, insights, or fixes would be greatly appreciated. Thank you! - First off, I'm puzzled about this...

Currently, the `Generation` enum has three cases: `chunk`, `info`, and `toolCall`. Many newer APIs (such as Ollama's `thinking` property on `Message`) now expose "thinking" content directly in their...
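A minimal sketch of what the request could look like: a fourth case added alongside the three existing ones. The case name, associated value types, and the `render` helper are all assumptions for illustration, not the library's actual API.

```swift
// Hypothetical sketch only: `thinking` and the payload types are assumed,
// mirroring how APIs like Ollama surface reasoning text separately.
enum Generation {
    case chunk(String)     // a piece of generated output text
    case info(String)      // generation metadata (payload type assumed)
    case toolCall(String)  // a tool invocation (payload type assumed)
    case thinking(String)  // proposed: model "thinking" text
}

// A consumer can then route thinking text separately from normal output.
func render(_ generation: Generation) -> String {
    switch generation {
    case .chunk(let text):
        return text
    case .thinking(let text):
        return "<think>\(text)</think>"
    case .info, .toolCall:
        return ""
    }
}
```

One design question this raises is whether `thinking` should be a new case or a flag on `chunk`; a separate case keeps the switch exhaustive, so existing consumers are forced to decide how to handle it.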

Creating a halved version of bge-large using the following Python code (imports added for completeness):

```python
from transformers import AutoModel, AutoTokenizer

hf_model = AutoModel.from_pretrained("BAAI/bge-large-en-v1.5")
hf_model.half()
hf_model.save_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
tokenizer.save_pretrained(tokenizer_path)
```

This seems to work just fine. However, loading this...

While working on the prompt cache together with the quantized KV cache, I've noticed that `maybeQuantizeKVCache` converts the `SimpleKVCache` into a quantized KV cache. However, the cache reference passed to `TokenIterator`...
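The failure mode described above can be illustrated with a reduced sketch. The class names below are stand-ins for the real MLX Swift cache types, and `maybeQuantize` is a simplified stand-in for `maybeQuantizeKVCache`: when a helper replaces a cache object in an array, any caller that captured the old reference keeps writing to the stale cache.

```swift
// Stand-in types; not the actual MLX Swift cache classes.
class SimpleCache {
    var tokens = 0
}
class QuantizedCache: SimpleCache {}

// Replaces each cache with a quantized one; old references become stale.
func maybeQuantize(_ caches: inout [SimpleCache]) {
    caches = caches.map { _ in QuantizedCache() }
}

var caches: [SimpleCache] = [SimpleCache()]
let held = caches[0]   // e.g. the reference a token iterator captured earlier
maybeQuantize(&caches)
held.tokens += 1       // mutates the stale cache, not the quantized replacement
```

After `maybeQuantize` runs, `held` and `caches[0]` are different objects, so updates through the captured reference never reach the quantized cache — which matches the symptom described in the report.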

See https://github.com/ml-explore/mlx-swift-examples/pull/238/files

The gemma3_text model has an `input_embeddings` parameter:

- https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/models/gemma3_text.py#L224

Per the Python docs on `generate_step`:

```
input_embeddings (mx.array, optional): Input embeddings to use in place of prompt tokens....
```

In the example applications on macOS, we could allow users to select a download directory for the weights. For example, they could pick `~/.cache` to match the Python download directory...
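A rough sketch of one way the apps could persist such a choice, assuming a `UserDefaults` key and a fallback path; the key name and the `~/.cache/huggingface` default are illustrative assumptions, not the example apps' actual settings.

```swift
import Foundation

// Illustrative key; not an actual setting in the example apps.
let downloadDirectoryKey = "modelDownloadDirectory"

// Returns the user-chosen weights directory if one was saved,
// otherwise falls back to the cache location the Python tooling uses.
func downloadDirectory(defaults: UserDefaults = .standard) -> URL {
    if let path = defaults.string(forKey: downloadDirectoryKey) {
        return URL(fileURLWithPath: path, isDirectory: true)
    }
    return FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent(".cache/huggingface", isDirectory: true)
}
```

In a real app the directory would be chosen via `NSOpenPanel` (with `canChooseDirectories` enabled) and stored as a security-scoped bookmark rather than a plain path, since sandboxed apps lose access to bare paths across launches.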

I'm on version 2.21.2, but it fails to build with the error "Ambiguous use of 'dictionary'" in Tokenizer.swift. Is there any workaround? Any help is appreciated.