mlx-examples

Examples in the MLX framework

Results: 167 mlx-examples issues

This feature allows us to dequantize a 4-bit-quantized model.
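A minimal sketch of what that looks like at the array level, assuming the `mx.quantize` / `mx.dequantize` pair from `mlx.core` (shapes and group size here are illustrative):

```python
import mlx.core as mx

# Round-trip a weight matrix through 4-bit quantization and back.
w = mx.random.normal((512, 512))
w_q, scales, biases = mx.quantize(w, group_size=64, bits=4)
w_hat = mx.dequantize(w_q, scales, biases, group_size=64, bits=4)

# w_hat approximates w up to the 4-bit rounding error.
print(mx.abs(w - w_hat).max())
```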

Recently I got a flow working where I would train a model with mlx (this is new for me) and then move over to llama.cpp to do the conversion to...

enhancement

When streaming using mlx_lm/server.py we should detect potential stop-sequence matches and keep generating tokens until we know there is no match. This prevents the server from sending parts of...
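A minimal sketch of the buffering idea (helper names are hypothetical, not the actual `mlx_lm/server.py` API): hold back any generated suffix that could still grow into a stop sequence.

```python
def partial_stop_len(text: str, stop_sequences: list[str]) -> int:
    """Length of the longest suffix of `text` that is a prefix of
    some stop sequence (a full match counts too)."""
    best = 0
    for stop in stop_sequences:
        for n in range(1, min(len(stop), len(text)) + 1):
            if text.endswith(stop[:n]):
                best = max(best, n)
    return best

def split_streamable(text: str, stop_sequences: list[str]) -> tuple[str, str]:
    """Split into (safe to send now, held back until resolved)."""
    held = partial_stop_len(text, stop_sequences)
    return text[: len(text) - held], text[len(text) - held :]
```

For example, with stop sequence `"</s>"`, the text `"Hello </"` splits into `("Hello ", "</")`; the `"</"` is only sent once further tokens rule out a match.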

When I run mlx_lm.convert for berkeley-nest/Starling-LM-7B-alpha, my MLX model suddenly has 32003 tokens instead of 32002. This creates issues if you want to train and later export a .gguf file...
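A hypothetical sanity check for this kind of mismatch, assuming `mlx_lm.load` and the attribute layout of its Llama-style models (the repo id for the converted model is illustrative):

```python
from transformers import AutoTokenizer
from mlx_lm import load

# Converted MLX model vs. the original tokenizer.
model, _ = load("mlx-community/Starling-LM-7B-alpha-4bit")
tok = AutoTokenizer.from_pretrained("berkeley-nest/Starling-LM-7B-alpha")

print("tokenizer tokens:", len(tok))  # includes added tokens, e.g. 32002
print("embedding rows:", model.model.embed_tokens.weight.shape[0])
```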

- [x] Sort dataset prior to batching for more consistent lengths
- [x] Compile non-MoE models
- [x] Add checkpointing as an option `--grad-checkpoint`

## Compile Benchmarks

Decent gain...
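For reference, a rough sketch of the two `mlx.core` function transforms behind the checklist items above, assuming `mx.compile` and `mx.checkpoint` (the layer itself is a stand-in):

```python
import mlx.core as mx

def step(x, w):
    # Stand-in for a transformer block.
    return mx.maximum(x @ w, 0.0)

compiled_step = mx.compile(step)         # trace once, replay the fused graph
checkpointed_step = mx.checkpoint(step)  # drop activations, recompute in backward
```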

The config file uploaded to Hugging Face repos by `mlx-lm` is currently not sorted. We need updates to sort the config files first and also update the `_name_or_path` key in `config.json` with...
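A minimal sketch of the proposed fix (function and argument names are illustrative, not the actual `mlx-lm` code):

```python
import json
from pathlib import Path

def save_config(config: dict, model_path: Path, upload_repo: str) -> None:
    # Point _name_or_path at the repo the weights are uploaded to.
    config["_name_or_path"] = upload_repo
    # Sort keys so the uploaded config.json is deterministic.
    with open(model_path / "config.json", "w") as f:
        json.dump(config, f, indent=4, sort_keys=True)
```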

The implementation of stop_criteria in mlx_lm.server is inherently flawed: stop sequences only get matched when the newest generated tokens perfectly match a stop sequence. However, it does not stop if...
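A sketch of the substring-based check the issue argues for: scan the whole decoded tail for any stop sequence rather than requiring an exact match at the newest tokens (the function name is hypothetical):

```python
def find_stop(text: str, stop_sequences: list[str]) -> int:
    """Index at which to cut generation, or -1 if no stop sequence occurs."""
    cut = -1
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1 and (cut == -1 or idx < cut):
            cut = idx
    return cut
```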

When trying to convert fine-tuned weights of StarCoder2 via

```shell
python -m mlx_lm.convert \
    --hf-path m-a-p/OpenCodeInterpreter-SC2-7B \
    -q \
    --upload-repo mlx-community/OpenCodeInterpreter-SC2-7B-4bit
```

I encountered

```
-> tokenizer = AutoTokenizer.from_pretrained(model_path_hf)
return...
```

Introduce one Reinforcement Learning from Human Feedback (RLHF) example, such as the Direct Preference Optimization (DPO) method.

**Paper** [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://arxiv.org/abs/2305.18290)

**Notes** [Direct...

enhancement
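For context on the request above, the DPO objective from the cited paper reduces to a few lines. A minimal sketch in `mlx.core`, assuming each argument is a batch of per-response summed log-probabilities for the chosen/rejected completions:

```python
import mlx.core as mx

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # beta * [(log pi(y_w) - log ref(y_w)) - (log pi(y_l) - log ref(y_l))]
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log sigmoid(x) == logaddexp(0, -x), which is numerically stable.
    return mx.mean(mx.logaddexp(0.0, -logits))
```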

When playing with fine-tuning I sometimes change from_linear in lora.py to experiment with its parameters. Should we add command-line args for these?

enhancement
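A sketch of what those command-line args could look like (the flag names are hypothetical, not existing `lora.py` options):

```python
import argparse

parser = argparse.ArgumentParser(description="LoRA fine-tuning")
parser.add_argument("--lora-rank", type=int, default=8,
                    help="Rank passed to LoRALinear.from_linear")
parser.add_argument("--lora-layers", type=int, default=16,
                    help="Number of layers to adapt")
args = parser.parse_args()
```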