Add support for FUSS/MUSDB separation task
Summary
This PR adds initial support for two new source separation tasks in DASB:
- FUSS (general audio)
- MUSDB18 (music)
These changes include the code required to prepare datasets from their respective raw sources and to compute standard source separation evaluation metrics.
Added Structure
FUSS/
- create_fuss.py: Converts FUSS eval data into a supervised-training-compatible format.
- fuss_prepare.py: Builds the DASB-style manifest for FUSS.
- utils.py: Utility functions (file I/O, path management, etc.).
- metrics/bsseval.py: Implements BSSeval metrics (SDR, SIR, SAR).
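For context, the SDR part of the metrics can be sketched as below. This is a simplified sanity-check version that omits the least-squares source projections of full BSSeval (which are what additionally yield SIR and SAR); the function name is a hypothetical illustration, not the actual API of metrics/bsseval.py.

```python
import numpy as np

def simple_sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Signal-to-distortion ratio in dB between a reference source and its
    estimate. Simplified: treats everything not in the reference as
    distortion, with a small epsilon to avoid division by zero."""
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / (np.sum(noise**2) + 1e-10))
```

A perfect estimate yields a very high SDR (bounded only by the epsilon), while added noise lowers it, which is enough to validate toy outputs.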
MUSDB/
- create_musdb.py: Creates chunked training data from the original MUSDB train set.
- create_musdb_eval.py: Chunks the eval/valid splits for supervised testing.
- musdb_prepare.py: Builds the DASB-style manifest for MUSDB.
- utils.py: File and audio handling utilities.
- metrics/bsseval.py: Same as the FUSS version, but task-local for now.
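The chunking step can be sketched as a fixed-length sliding window over the waveform. The helper name and signature below are assumptions for illustration, not the actual interface of create_musdb.py.

```python
import numpy as np

def chunkify(audio: np.ndarray, chunk_len: int, hop: int) -> list:
    """Split a mono waveform into fixed-length chunks spaced `hop` samples
    apart, dropping any trailing partial chunk."""
    if len(audio) < chunk_len:
        return []
    n = 1 + (len(audio) - chunk_len) // hop
    return [audio[i * hop : i * hop + chunk_len] for i in range(n)]
```

Dropping the trailing partial chunk keeps every training example the same length; an alternative design would zero-pad the last chunk instead.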
Motivation
These additions bring support for non-speech source separation benchmarks to DASB, expanding its scope beyond Libri2Mix. This lays the foundation for consistent evaluation of discrete audio tokens across music and general audio separation tasks.
Notes
- Evaluation metrics are implemented separately in each task dir but can later be unified if desired.
- No model training or inference logic is included yet; that will follow in a future PR.
Testing
- All data preparation scripts have been run on local machines with subsets of FUSS and MUSDB.
- Output manifests are consistent with Libri2Mix format.
- Evaluation metrics produce valid SDR/SIR/SAR when tested on toy outputs.
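The manifest-writing step mentioned above can be sketched as a small JSON helper. The field names here are illustrative assumptions, not the exact Libri2Mix/DASB schema.

```python
import json

def build_manifest(entries: list, out_path: str) -> dict:
    """Write a manifest mapping utterance IDs to mixture path, source paths,
    and duration. Field names are illustrative, not the DASB schema."""
    manifest = {
        e["id"]: {
            "mix_wav": e["mix_wav"],
            "source_wavs": e["source_wavs"],
            "duration": e["duration"],
        }
        for e in entries
    }
    with open(out_path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest
```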
Next Steps
- Add train.py and hparams/ for both FUSS and MUSDB.
I have updated the main branch of DASB to address the CI failures caused by a GitHub change; merging upstream DASB into this branch should make the CI run properly again.
@darius522, could you please resolve the conflict?