Max Caldwell

8 comments of Max Caldwell

I got this working as well! Inference time seems to increase more than linearly with prompt size: 3 seconds of audio takes 10 seconds of generation; 8s of audio:...
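
A quick way to sanity-check that scaling claim (not from the original comment; `transcribe` and the clip list are hypothetical stand-ins for whatever model call and test files you have):

```python
# Rough timing harness to check how generation time scales with input length;
# `transcribe` is a hypothetical stand-in for the actual model call.
import time

def scaling_check(transcribe, clips):
    """clips: iterable of (seconds_of_audio, path_to_clip) pairs."""
    for seconds, path in clips:
        start = time.perf_counter()
        transcribe(path)
        elapsed = time.perf_counter() - start
        # If elapsed/seconds grows as seconds grows, scaling is superlinear.
        print(f"{seconds:>4}s audio -> {elapsed:.1f}s generation "
              f"({elapsed / seconds:.2f}x realtime)")
```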

+1, agreed, but in the CLI lib here: https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/utils.py you can see some arguments available that might work like you're asking for. There are some models available that, if you...

@junrushao how can we find tokens/sec? I'd say 'quite fast': it's the fastest LLM I've run on this 2020 MacBook Pro M1 8GB. 10x faster than your WebGPU demo running with less...
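
If the runtime doesn't report it, a minimal sketch of measuring tokens/sec by hand (not from the original thread; `generate` and `tokenizer` are hypothetical stand-ins for whatever runtime and tokenizer you are benchmarking):

```python
# Minimal tok/s measurement; `generate` and `tokenizer` are hypothetical
# handles, not a real mlc-llm API.
import time

def tokens_per_second(generate, tokenizer, prompt: str) -> float:
    start = time.perf_counter()
    output = generate(prompt)                # generated text only
    elapsed = time.perf_counter() - start
    return len(tokenizer.encode(output)) / elapsed
```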

Killer, I'm at encode: 31.9 tok/s, decode: 11.4 tok/s on a 2020 MacBook Pro M1 8GB with the default vicuna 6b. For reference, my decode on the WebGPU demo is like,...

Confirmed this is also happening for the new Hermes Pro model with many different variations of this template. TEMPLATE """{{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ if...

Update from @mchiang0610: all of our files need these for ChatML: `PARAMETER stop <|im_start|>` and `PARAMETER stop <|im_end|>`. @olafgeibig try yours without the quotation marks?
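
For anyone landing here, a minimal ChatML Modelfile sketch putting the template and stop parameters together (the `FROM` path is a placeholder; this is an illustration, not the exact file from the thread):

```
FROM ./model.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
```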

Is there any more information about what's needed to author a `convert.py` for a given model? I'm seeing a lot of similarities between them in terms of loading the weights...
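
A sketch of the shared shape these scripts seem to follow (hypothetical: `WEIGHT_MAP` and the key names are made-up examples, not any project's real mapping):

```python
# Hypothetical convert.py skeleton based on the common structure described
# above: load a checkpoint, rename keys, downcast, save.
import torch

WEIGHT_MAP = {
    # source checkpoint key        -> target runtime key (illustrative only)
    "model.embed_tokens.weight": "embedding.weight",
}

def convert(src_path: str, dst_path: str) -> None:
    state = torch.load(src_path, map_location="cpu")
    out = {}
    for src_key, tensor in state.items():
        # Rename keys the target runtime expects; pass the rest through.
        out[WEIGHT_MAP.get(src_key, src_key)] = tensor.half()  # downcast step
    torch.save(out, dst_path)

if __name__ == "__main__":
    convert("pytorch_model.bin", "converted.bin")
```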