mlx-examples

mlx_whisper: add support for audio input from stdin

Open anthonywu opened this issue 4 months ago • 2 comments

Problem

I wanted to pipe an audio file to mlx_whisper, but found that it only accepts file paths. This PR lets mlx_whisper read audio from stdin and pass it to ffmpeg, after which the rest of the workflow proceeds as usual.

Changes

  1. the load_audio helper adjusts its ffmpeg flags depending on whether the input is a file path or stdin
  2. the CLI parser gracefully omits the otherwise-required positional audio arg when stdin is detected as the input
  3. an optional --input-name arg lets users name the otherwise anonymous stdin content (a name cannot be inferred from a file path in that case)
  4. added tests in a zsh script (zsh being the standard macOS shell) that drive and test the changes from the CLI
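
For reviewers who want a quick mental model of changes 1 to 3, here is a minimal sketch, not the actual diff. It assumes stdin mode is signalled by the source "-", that ffmpeg reads piped input via "pipe:0", and that the decoder targets 16 kHz mono 16-bit PCM. The helper names build_ffmpeg_cmd, build_parser and decode, and the "content" default for --input-name, are hypothetical.

```python
import argparse
import subprocess
import sys

SAMPLE_RATE = 16000  # assumption: target sample rate used by the decoder


def build_ffmpeg_cmd(source, sr=SAMPLE_RATE):
    """Build an ffmpeg argv that decodes `source` to mono 16-bit PCM on stdout.

    `source` is a file path, or "-" to mean "read the audio from stdin"
    (hypothetical convention for this sketch).
    """
    from_stdin = source == "-"
    cmd = ["ffmpeg"]
    if not from_stdin:
        # -nostdin disables ffmpeg's interactive console input; this sketch
        # omits it in stdin mode as a conservative choice.
        cmd.append("-nostdin")
    cmd += [
        "-threads", "0",
        "-i", "pipe:0" if from_stdin else source,
        "-f", "s16le", "-ac", "1", "-acodec", "pcm_s16le",
        "-ar", str(sr),
        "-",  # write raw PCM to stdout
    ]
    return cmd


def build_parser(stdin_is_piped):
    """The positional audio arg is required only when nothing is piped in."""
    parser = argparse.ArgumentParser(prog="mlx_whisper")
    parser.add_argument("audio", nargs="*" if stdin_is_piped else "+")
    # Hypothetical default; names the otherwise anonymous stdin content.
    parser.add_argument("--input-name", default="content")
    return parser


def decode(source):
    """Run ffmpeg and return raw PCM bytes, forwarding stdin when asked."""
    data = sys.stdin.buffer.read() if source == "-" else None
    result = subprocess.run(
        build_ffmpeg_cmd(source), input=data, capture_output=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    piped = not sys.stdin.isatty()
    args = build_parser(piped).parse_args()
    for src in args.audio or ["-"]:
        print(src, len(decode(src)), "bytes of PCM")
```

Detecting a pipe with sys.stdin.isatty() is one possible approach; the PR may determine stdin activity differently.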

Process

  1. ran black and pre-commit on changes prior to PR
  2. python test.py reports 4 errors, some involving floating-point comparisons. They appear unrelated to my change and may be known issues.

anthonywu, Oct 03 '24 10:10