Refactor MP3 compression backend to replace unmaintained pydub with ffmpeg

Open Bhuman-Patel opened this issue 7 months ago • 1 comments

Summary

This PR replaces the pydub-based MP3 compression backend with a modern ffmpeg-based approach, addressing issue #373.

Key Changes

  • Replaced apply_pydub() with apply_ffmpeg_mp3_compression()
  • Introduced "ffmpeg" backend as a cleaner, maintained alternative
  • Removed pydub from all dependency files
  • Added soundfile as an actively maintained WAV writer

Motivation

pydub has been unmaintained since 2022, and this usage already depended on ffmpeg under the hood. The new backend simplifies the stack and improves maintainability.

Closes #373

Bhuman-Patel avatar Jun 02 '25 20:06 Bhuman-Patel

Thanks for taking the time to make a PR/contribution! Here's some feedback:

  • Please remove "# Added imports for soundfile and subprocess". Such comments are commonly added by LLMs trying to be helpful, but I don't think they are useful in the final result.
  • Please do not add .DS_Store files to the repository. It's a good idea to add a rule for this type of file in .gitignore.
  • You are proposing to replace "pydub" with "ffmpeg", but I think it's better to add "ffmpeg" without removing pydub. To ease the transition, keep the pydub option intact for now, but deprecate it with a message about which option to use instead in the future ("ffmpeg").
  • There's a mismatch: In the docstring and API docs you replaced "pydub" with "ffmpeg", but forgot to change the API accordingly.
  • For data augmentation like this, less I/O is better. In this code, we first write a wav file, then load that wav file while encoding it and write the result as a new file (mp3), then load that mp3 file from disk while decoding it. I do not consider that a fast approach. A faster approach would do it all in memory, with pipelining, possibly doing the encoding and decoding on separate CPU cores concurrently. If you are feeling adventurous, you could consider making an implementation like this.
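
To make the last point concrete, here is a rough sketch of an in-memory round trip. It is not code from this PR or from audiomentations. It assumes an ffmpeg binary with an MP3 encoder is on the PATH, and the function names (build_ffmpeg_commands, mp3_round_trip) and the raw float32 PCM interface are made up for illustration. Two ffmpeg processes are chained through OS pipes, so encoding and decoding can overlap on separate cores and no temporary files are written.

```python
import subprocess
import threading


def build_ffmpeg_commands(sample_rate, channels, bitrate_kbps):
    """Build the encoder and decoder command lines (hypothetical helper)."""
    base = ["ffmpeg", "-hide_banner", "-loglevel", "error"]
    # Raw little-endian float32 PCM, used for the encoder input
    # and for the decoder output.
    raw = ["-f", "f32le", "-ar", str(sample_rate), "-ac", str(channels)]
    encode = base + raw + ["-i", "pipe:0",
                           "-b:a", f"{bitrate_kbps}k", "-f", "mp3", "pipe:1"]
    decode = base + ["-f", "mp3", "-i", "pipe:0"] + raw + ["pipe:1"]
    return encode, decode


def mp3_round_trip(pcm_bytes, sample_rate=44100, channels=1, bitrate_kbps=64):
    """Encode raw float32 PCM to MP3 and decode it again, without touching disk.

    Assumption: ffmpeg is installed and can encode MP3.
    """
    encode_cmd, decode_cmd = build_ffmpeg_commands(
        sample_rate, channels, bitrate_kbps
    )
    encoder = subprocess.Popen(
        encode_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    # The decoder reads the MP3 stream directly from the encoder's stdout.
    decoder = subprocess.Popen(
        decode_cmd, stdin=encoder.stdout, stdout=subprocess.PIPE
    )
    # Drop our copy of the pipe so the decoder sees EOF when the encoder exits.
    encoder.stdout.close()

    def feed():
        # Write in a separate thread so a full pipe buffer cannot deadlock
        # against the read below.
        try:
            encoder.stdin.write(pcm_bytes)
            encoder.stdin.close()
        except BrokenPipeError:
            pass  # the encoder exited early; the return code check reports it

    writer = threading.Thread(target=feed)
    writer.start()
    decoded = decoder.stdout.read()
    writer.join()
    encoder.wait()
    decoder.wait()
    if encoder.returncode != 0 or decoder.returncode != 0:
        raise RuntimeError("ffmpeg MP3 round trip failed")
    return decoded
```

With numpy, the input could be produced with samples.astype("float32").tobytes() and the result read back with np.frombuffer(decoded, dtype="float32"). The decoded signal may differ in length from the input because of MP3 encoder delay and padding, so a real transform would still need to trim or align it.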

iver56 avatar Jun 03 '25 08:06 iver56

I am now closing this PR due to lack of response, and because I want to take this transform in a different direction, with https://github.com/iver56/fast-mp3-augment

iver56 avatar Jun 28 '25 10:06 iver56