Supporting music use cases in TorchAudio

Open hwangjeff opened this issue 2 years ago • 4 comments

Hello all!

Currently, TorchAudio doesn’t provide much support for music use cases, and we’d like to gauge the community’s interest in improving it. Requests we’ve received so far include operators for chroma feature extraction, spectral flux, beat detection, and onset detection, as well as pretrained models for objective evaluation, e.g. VGGish, PaSST, and CLAP. How compelling would these features be? Are there other features that would be useful to add?

hwangjeff avatar Jun 07 '23 18:06 hwangjeff

Hi! If no one is working on it, I would like to work on the models.

pablf avatar Jun 27 '23 20:06 pablf

Hi @pablf, sorry for the late reply. That sounds great. Which models would you like to work on?

hwangjeff avatar Jul 24 '23 22:07 hwangjeff

I have been working on the CLAP model. I am hoping to upload a draft within a few days, so we can discuss it.

pablf avatar Jul 27 '23 19:07 pablf

@pablf sounds good, looking forward to it.

hwangjeff avatar Jul 31 '23 15:07 hwangjeff