mlx-audio Repeated model download despite presence in Hugging Face cache

Description:

When using the mlx_audio.tts.generate command with a Hugging Face model identifier, the model is downloaded every time the command is run, even if the model files are already present in the Hugging Face cache. This results in unnecessary bandwidth usage and delays.

Steps to reproduce:

Run the following command:

python -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B-4bit --text "Hello world"

Observe that the model (Dia-1.6B-4bit) is downloaded. The downloaded files are stored in the Hugging Face cache, typically located at ~/.cache/huggingface/hub/models--mlx-community--Dia-1.6B-4bit.

Run the exact same command again:

python -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B-4bit --text "Another phrase"

Observe that the model is downloaded again, even though the files are already present in the cache.
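One way to confirm whether the second run really rewrites anything is to compare file sizes and modification times in the cache directory before and after the run. This is a minimal sketch using only the standard library. The helper names fingerprint and changed_files are hypothetical and not part of mlx-audio or huggingface_hub, and it assumes that a re-download rewrites the file and so changes its size or modification time.

```python
from pathlib import Path


def fingerprint(cache_dir):
    """Map each file under cache_dir to its (size, mtime_ns) pair."""
    root = Path(cache_dir).expanduser()
    result = {}
    for path in root.rglob("*"):
        # is_file() follows symlinks, so snapshot entries resolve to their blobs;
        # broken symlinks are skipped.
        if path.is_file():
            stat = path.stat()
            result[str(path.relative_to(root))] = (stat.st_size, stat.st_mtime_ns)
    return result


def changed_files(before, after):
    """Return the sorted relative paths that were added or rewritten."""
    return sorted(name for name, meta in after.items() if before.get(name) != meta)
```

Calling fingerprint on the repository's cache directory before the second run and again after it, then passing both results to changed_files, yields an empty list when no cached file was added or rewritten.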

Relevant Code Snippet (from [mlx_audio/tts/utils.py]):

from pathlib import Path
from typing import Optional

from huggingface_hub import snapshot_download


def get_model_path(path_or_hf_repo: str, revision: Optional[str] = None) -> Path:
    """
    Ensures the model is available locally. If the path does not exist locally,
    it is downloaded from the Hugging Face Hub.

    Args:
        path_or_hf_repo (str): The local path or Hugging Face repository ID of the model.
        revision (str, optional): A revision id which can be a branch name, a tag, or a commit hash.

    Returns:
        Path: The path to the model.
    """
    model_path = Path(path_or_hf_repo)

    if not model_path.exists():
        model_path = Path(
            snapshot_download(
                path_or_hf_repo,
                revision=revision,
                allow_patterns=[
                    "*.json",
                    "*.safetensors",
                    "*.py",
                    "*.model",
                    "*.tiktoken",
                    "*.txt",
                    "*.jsonl",
                    "*.yaml",
                ],
            )
        )

    return model_path

Installation Method:

Installed from the Git repository (branch main, commit ab45cdeb8a7e61ec9ad431f91d5200202fe32d8a) with pip install .

Expected Behavior:

The model should be downloaded only the first time the command is run. Subsequent runs with the same model identifier should load the model from the Hugging Face cache without requiring a full download.

Observed Behavior:

The model is downloaded every time the mlx_audio.tts.generate command is executed with a Hugging Face model identifier.

Workarounds (if any):

I tried passing the full path to the cached model directory (e.g., ~/.cache/huggingface/hub/models--mlx-community--Dia-1.6B-4bit) directly to --model, but then I hit a different problem: the model was identified as model_type 'bit', which doesn't exist.
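For reference, the directory above is the repository's cache root. In the standard Hugging Face cache layout the model files themselves sit one level deeper, under snapshots/<commit hash>, and refs/<branch> records which commit a branch points to. The sketch below locates that snapshot directory from a repository ID. The function names repo_cache_dir and cached_snapshot are hypothetical, the default cache location is assumed to be ~/.cache/huggingface/hub (environment overrides are not handled), and whether passing the snapshot directory to --model avoids the model_type error was not tested.

```python
from pathlib import Path


def repo_cache_dir(repo_id, cache_root=None):
    """Return the cache folder for a model repo, e.g. models--org--name."""
    # Assumption: default cache location; pass cache_root to override it.
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "huggingface" / "hub"
    return root / ("models--" + repo_id.replace("/", "--"))


def cached_snapshot(repo_id, revision="main", cache_root=None):
    """Return the cached snapshot directory for a revision, or None if absent."""
    repo_dir = repo_cache_dir(repo_id, cache_root)
    ref_file = repo_dir / "refs" / revision
    # A branch or tag name resolves through refs/; otherwise treat it as a commit hash.
    commit = ref_file.read_text().strip() if ref_file.is_file() else revision
    snapshot = repo_dir / "snapshots" / commit
    return snapshot if snapshot.is_dir() else None
```

For example, cached_snapshot("mlx-community/Dia-1.6B-4bit") returns a path ending in snapshots/<commit hash> when the model is cached, and None otherwise.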

May 15 '25 02:05 NiltonVolpato

I can run huggingface-cli download from the command line and it uses the cached version, but not when loading via mlx-audio:

❯ huggingface-cli download mlx-community/Dia-1.6B-4bit
Fetching 5 files: 100%|█████████████████████████████████████████████████| 5/5 [00:00<00:00, 42711.85it/s]
/Users/nilton/.cache/huggingface/hub/models--mlx-community--Dia-1.6B-4bit/snapshots/d74afc38d6ee8909c1cf17ba2c281cfc66423bb7
❯ huggingface-cli download mlx-community/Dia-1.6B-4bit
Fetching 5 files: 100%|█████████████████████████████████████████████████| 5/5 [00:00<00:00, 65741.44it/s]
/Users/nilton/.cache/huggingface/hub/models--mlx-community--Dia-1.6B-4bit/snapshots/d74afc38d6ee8909c1cf17ba2c281cfc66423bb7

May 15 '25 03:05 NiltonVolpato

I was looking at this again, and I think it's working correctly. For some reason it downloaded the model twice, but after that it didn't download it again. I got confused by the other progress bar that's shown, which isn't actually downloading anything. I'm just going to close this. Sorry for the false alarm.

May 18 '25 10:05 NiltonVolpato

Hey @NiltonVolpato

Thanks for raising the issue!

No worries, feel free to raise any concerns at any time.

May 18 '25 12:05 Blaizzy