Make-An-Audio
feat(ui): introduce single‑file Gradio demo (app.py)
Summary
Adds a self-contained Gradio front-end that turns text prompts into audio clips using the existing diffusion sampler and BigVGAN vocoder. Launch it with one command, explore results in the browser, download clips on demand—no disk writes unless the user clicks Download.
Highlights
- Zero configuration: `python app.py` opens http://127.0.0.1:7860.
- Five inputs: prompt, DDIM steps, duration, guidance scale, sample count (up to 10); a rough UI sketch follows this list.
- Parallel previews: up to 10 audio players appear dynamically; each has a built-in download button.
- Stateless: all artefacts remain in RAM; nothing persists after the session.
- Efficient cold-start: models load once at import; subsequent generations reuse them.
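For orientation, the layout described above might look roughly like the sketch below. It is a minimal illustration rather than the actual `app.py`: the slider ranges and defaults are guesses, and the callback is stubbed (its real return contract is sketched under Implementation notes).

```python
import gradio as gr

MAX_AUDIO_PLAYERS = 10  # upper bound on parallel previews


def generate_and_update(prompt, ddim_steps, duration, guidance_scale, n_samples):
    # Stub: the real callback runs the diffusion sampler + BigVGAN vocoder.
    return [gr.update(visible=False)] * MAX_AUDIO_PLAYERS


with gr.Blocks() as demo:
    # Five inputs: prompt, DDIM steps, duration, guidance scale, sample count.
    prompt = gr.Textbox(label="Prompt")
    ddim_steps = gr.Slider(10, 250, value=100, step=1, label="DDIM steps")
    duration = gr.Slider(1, 10, value=10, step=1, label="Duration (s)")
    guidance_scale = gr.Slider(1.0, 10.0, value=3.0, step=0.5, label="Guidance scale")
    n_samples = gr.Slider(1, MAX_AUDIO_PLAYERS, value=1, step=1, label="Samples")
    generate = gr.Button("Generate")

    # Ten audio players are created up front but stay hidden until they receive
    # audio; each gr.Audio component ships with its own download button.
    players = [gr.Audio(label=f"Sample {i + 1}", visible=False) for i in range(MAX_AUDIO_PLAYERS)]

    generate.click(
        generate_and_update,
        inputs=[prompt, ddim_steps, duration, guidance_scale, n_samples],
        outputs=players,
    )

if __name__ == "__main__":
    demo.launch()
```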
How to run
```bash
python app.py
```
That’s it—the default browser will open automatically.
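For reviewers curious about the auto-open behaviour: Gradio exposes an `inbrowser` flag on `launch()`, so a call along the following lines would produce it. Whether `app.py` passes these arguments explicitly is an assumption; treat this as a sketch.

```python
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("Make-An-Audio demo")  # placeholder UI

# inbrowser=True opens the default browser; 7860 is Gradio's default port.
demo.launch(server_name="127.0.0.1", server_port=7860, inbrowser=True)
```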
Implementation notes
- Tested locally on CUDA 12.4 GPU and on a CPU-only machine.
- `generate_and_update` always returns a list of exactly `MAX_AUDIO_PLAYERS` `gr.update` objects, keeping Gradio's diffing predictable; a sketch of that contract follows this list.
- TODOs are embedded in the docstring (GPU OOM handling, input validation, seed control).
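A hedged sketch of that contract, not the actual implementation: the 16 kHz rate, zero-filled waveforms, and parameter names are placeholders. It fills one visible `gr.update` per generated clip and pads the rest with hidden updates so the list length is always `MAX_AUDIO_PLAYERS`.

```python
import numpy as np
import gradio as gr

MAX_AUDIO_PLAYERS = 10


def generate_and_update(prompt, ddim_steps, duration, guidance_scale, n_samples):
    # In the real app these waveforms come from the diffusion sampler + BigVGAN;
    # zeros stand in here so the sketch runs on its own.
    sr = 16000  # assumed sample rate for illustration
    clips = [np.zeros(int(sr * duration), dtype=np.float32) for _ in range(int(n_samples))]

    # One visible player per clip, handed to Gradio as in-memory (sr, array)
    # tuples, so nothing is written to disk.
    updates = [gr.update(value=(sr, clip), visible=True) for clip in clips]

    # Pad with hidden updates so the list length is always MAX_AUDIO_PLAYERS,
    # matching the fixed set of output components.
    updates += [gr.update(visible=False)] * (MAX_AUDIO_PLAYERS - len(updates))
    return updates


# Quick shape check: the callback always yields exactly MAX_AUDIO_PLAYERS updates.
assert len(generate_and_update("rain on a tin roof", 100, 5, 3.0, 3)) == MAX_AUDIO_PLAYERS
```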
Checklist
- [x] Code follows project style and PEP 8.
- [x] Comprehensive docstrings and inline comments.
- [x] No new runtime dependencies (except `gradio`).
- [x] Manual tests: GPU (CUDA 12.4) and CPU paths.