Yän.PnG comments

Yän.PnG

Getting text/audio embeddings (and their gradients) from the pretrained models.

Hi! I am using the [Kaldi-compatible APIs](https://github.com/zhaoyanpeng/vipant/blob/93b06ff43ae6a76323cecea4c10cf457945c2711/cvap/data/audio/transform.py#L29) of [torchaudio](https://pytorch.org/audio/stable/compliance.kaldi.html#fbank). I think you are right: the transform function does not seem to produce any gradients, so there is no way to run gradients through...
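One quick way to check whether a given transform keeps gradients is to feed it a tensor with `requires_grad=True` and see whether anything flows back. The sketch below is only an illustration: `propagates_gradients` and the two stand-in transforms are hypothetical names that do not come from the repository, and it makes no claim about how the actual fbank transform behaves.

```python
import torch


def propagates_gradients(transform, waveform):
    """Return True if `transform` keeps its output attached to the autograd graph."""
    # Work on a fresh leaf tensor so the caller's tensor is left untouched.
    x = waveform.clone().detach().requires_grad_(True)
    out = transform(x)
    if not out.requires_grad:
        return False
    # Reduce to a scalar and backpropagate; a populated .grad confirms the path.
    out.sum().backward()
    return x.grad is not None


# Stand-in transforms (hypothetical, for illustration only).
def differentiable(x):
    return torch.log(x.abs() + 1e-6)


def detached(x):
    # Detaching the input cuts the graph, as a non-differentiable pipeline would.
    return torch.log(x.detach().abs() + 1e-6)


waveform = torch.randn(1, 16000)
print(propagates_gradients(differentiable, waveform))  # True
print(propagates_gradients(detached, waveform))        # False
```

To test the real pipeline, the same helper could be called with a wrapper around `torchaudio.compliance.kaldi.fbank` (or the repository's transform) in place of the stand-ins.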

Getting text/audio embeddings (and their gradients) from the pretrained models.

If you want to encode your own audio-text data with a pre-trained VA model, you would need to modify this [function](https://github.com/zhaoyanpeng/vipant/blob/93b06ff43ae6a76323cecea4c10cf457945c2711/cvap/model/clap.py#L42-L53) to save the audio and text features directly. You...
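As a rough sketch of what "directly save the features" could look like once the two feature tensors are available inside that function: the helper names, the dictionary keys, and the tensor shapes below are all assumptions for illustration, not the repository's actual code.

```python
import os
import tempfile

import torch


def save_features(audio_features, text_features, path):
    """Write both feature tensors to one file (hypothetical helper)."""
    payload = {
        # Detach and move to CPU so the file holds plain tensors, not graph nodes.
        "audio": audio_features.detach().cpu(),
        "text": text_features.detach().cpu(),
    }
    torch.save(payload, path)


def load_features(path):
    """Read back the dictionary written by save_features."""
    return torch.load(path)


# Placeholder tensors standing in for the model's outputs (shapes are made up).
audio = torch.randn(4, 512, requires_grad=True)
text = torch.randn(4, 512, requires_grad=True)

with tempfile.TemporaryDirectory() as tmp_dir:
    feature_path = os.path.join(tmp_dir, "features.pt")
    save_features(audio, text, feature_path)
    loaded = load_features(feature_path)

print(loaded["audio"].shape, loaded["text"].shape)
```

Note that detaching before saving drops the autograd history, so this route suits exporting embeddings rather than computing gradients with respect to them.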