
[WIP] Reduce deps core

Open · Blaizzy opened this issue 1 month ago · 1 comment

Summary: This PR removes the dependency on torch, torchvision, and transformers by porting the necessary processors directly into mlx-vlm. It also restructures pyproject.toml to support optional installations.

Changes:

  • Removed Dependencies: Core installation no longer requires torch, torchvision, or transformers.
  • New Extras: Added optional extras [trainer], [server], and [audio].
  • Refactoring:
    • Replaced mlx-audio with soundfile.
    • Moved audio imports to be lazy-loaded within functions to avoid crashes for users without audio dependencies.
    • Cleaned up redundant imports in utils.py.
  • Docs: Added installation instructions for optional dependencies to the README.
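
As a rough illustration of the lazy-loading change described above, the pattern could look something like the sketch below. The helper names `require_optional` and `load_audio` are hypothetical and not taken from the mlx-vlm codebase; only the `soundfile` package and the `[audio]` extra come from this PR description.

```python
import importlib


def require_optional(module_name, extra):
    """Import module_name on demand, or fail with an install hint.

    Hypothetical helper: the name and the error message are assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is needed for this feature. "
            f"Install it with: pip install 'mlx-vlm[{extra}]'"
        ) from exc


def load_audio(path):
    """Read an audio file; soundfile is imported only when this is called."""
    sf = require_optional("soundfile", "audio")
    data, sample_rate = sf.read(path)  # returns (samples array, sample rate)
    return data, sample_rate
```

Because the import happens inside the function, importing the module that defines `load_audio` succeeds even when `soundfile` is absent; the error only surfaces if an audio code path is actually used.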

Blaizzy · Nov 19 '25 22:11

Sort of related: have you considered replacing py-opencv, which pulls in a rather hefty set of deps (120+)? It looks like it's currently only used to load and resize video frames.

altaic · Nov 21 '25 03:11