speech-representation topics

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

ddlBoJack

iemocap

pytorch-implementation

speech-emotion-recognition

speech-representation

WavTokenizer

1.2k Stars · 102 Forks · 1.2k Watchers

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

jishengpeng

acoustic

audio-representation

codec

dac

WavChat

310 Stars · 17 Forks · 310 Watchers

A Survey of Spoken Dialogue Models (60 pages)

jishengpeng

duplex

encodec

gpt-4o

intreaction