Yoshua Bengio
Choose a format and associated object class to hold the different speech features (some may be global to the sequence). Coordinate with +Pascal Vincent on this (project-wide data organization).
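One possible shape for such a container (a sketch only — the actual format and class names are to be decided with +Pascal Vincent; everything here is hypothetical): a small class holding a per-frame feature matrix plus a dict of features global to the sequence.

```python
from dataclasses import dataclass, field

@dataclass
class SpeechFeatures:
    """Per-utterance container: a list of per-frame feature vectors
    plus features that are global to the whole sequence."""
    frames: list                 # n_frames rows, one feature vector per frame
    feature_names: list          # column names, e.g. ["energy", "zcr"]
    global_feats: dict = field(default_factory=dict)  # e.g. {"duration_s": 1.5}

    @property
    def n_frames(self):
        return len(self.frames)

feats = SpeechFeatures(
    frames=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    feature_names=["energy", "zcr"],
    global_feats={"duration_s": 1.5},
)
```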
Select and code up advanced feature extraction algorithms that may be relevant to the task of emotion classification (look at what the previous papers on the subject have reported).
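The actual algorithms should come from the prior papers; as a placeholder, a minimal numpy sketch of frame-level extraction of two simple features (log energy and spectral centroid) — window and hop sizes here are illustrative, not a recommendation.

```python
import numpy as np

def frame_features(signal, sr=16000, win=400, hop=160):
    """Slice `signal` into overlapping windowed frames and compute two
    frame-level features: log energy and spectral centroid (Hz)."""
    n_frames = 1 + (len(signal) - win) // hop
    window = np.hanning(win)
    freqs = np.fft.rfftfreq(win, d=1.0 / sr)
    feats = np.empty((n_frames, 2))
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + win] * window
        spectrum = np.abs(np.fft.rfft(frame))
        feats[i, 0] = np.log(np.sum(frame ** 2) + 1e-10)                      # log energy
        feats[i, 1] = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-10)   # centroid
    return feats

sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of a 440 Hz tone
F = frame_features(sig)
```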
Set up and train a deep convolutional network for speech-to-emotion with an input window over the features extracted by +Nicolas and max-pooling over the output sequence. Train it first on the challenge data,...
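A minimal numpy forward-pass sketch of the idea (random weights, made-up dimensions — not the actual model): a temporal convolution slides the same affine map over every input window of the frame-feature sequence, producing per-position emotion scores that are max-pooled into one sequence-level prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_feats, window, n_classes = 50, 13, 5, 6

X = rng.standard_normal((n_frames, n_feats))                  # frame-level features
W = rng.standard_normal((window * n_feats, n_classes)) * 0.1  # shared filter bank
b = np.zeros(n_classes)

# Temporal convolution: same weights applied at every window position.
positions = n_frames - window + 1
scores = np.empty((positions, n_classes))
for t in range(positions):
    scores[t] = np.tanh(X[t:t + window].reshape(-1) @ W + b)

# Max-pool over the output sequence -> one score vector per utterance.
pooled = scores.max(axis=0)
probs = np.exp(pooled) / np.exp(pooled).sum()                 # softmax over emotions
```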
Set up and train a deep MLP for speech-to-emotion with an input window over the features extracted by +Nicolas and max-pooling over the output sequence. Train it first on the challenge data, then...
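Analogously, a hypothetical sketch of the windowed MLP variant: one ReLU hidden layer applied independently to each sliding window, again max-pooled over time (dimensions are illustrative).

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, n_feats, window, n_hidden, n_classes = 50, 13, 9, 32, 6

X = rng.standard_normal((n_frames, n_feats))
W1 = rng.standard_normal((window * n_feats, n_hidden)) * 0.1
W2 = rng.standard_normal((n_hidden, n_classes)) * 0.1

def mlp_window(x_win):
    h = np.maximum(0.0, x_win.reshape(-1) @ W1)   # ReLU hidden layer
    return h @ W2                                 # per-window class scores

scores = np.stack([mlp_window(X[t:t + window])
                   for t in range(n_frames - window + 1)])
utterance_scores = scores.max(axis=0)             # max-pool over time
prediction = int(utterance_scores.argmax())
```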
Code and train a bidirectional recurrent net for speech-to-emotion, mapping the feature sequence to a single probabilistic output through a final max-pooling (or other) aggregation. Try pooling the top hidden...
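A numpy forward-pass sketch of the bidirectional idea (random weights, illustrative sizes): two vanilla RNN passes, left-to-right and right-to-left, concatenated per frame, then mapped to per-frame scores and max-pooled into the single probabilistic output.

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_feats, n_hidden, n_classes = 30, 13, 16, 6

X = rng.standard_normal((n_frames, n_feats))
Wx_f = rng.standard_normal((n_feats, n_hidden)) * 0.1   # forward-direction weights
Wh_f = rng.standard_normal((n_hidden, n_hidden)) * 0.1
Wx_b = rng.standard_normal((n_feats, n_hidden)) * 0.1   # backward-direction weights
Wh_b = rng.standard_normal((n_hidden, n_hidden)) * 0.1
Wo = rng.standard_normal((2 * n_hidden, n_classes)) * 0.1

def rnn(seq, Wx, Wh):
    h = np.zeros(n_hidden)
    out = []
    for x in seq:
        h = np.tanh(x @ Wx + h @ Wh)
        out.append(h)
    return np.stack(out)

fwd = rnn(X, Wx_f, Wh_f)                 # left-to-right pass
bwd = rnn(X[::-1], Wx_b, Wh_b)[::-1]     # right-to-left pass, realigned in time
H = np.concatenate([fwd, bwd], axis=1)   # (n_frames, 2 * n_hidden)

scores = H @ Wo                          # per-frame emotion scores
pooled = scores.max(axis=0)              # final max-pooling aggregation
probs = np.exp(pooled) / np.exp(pooled).sum()
```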
Get other relevant datasets which could help train an emotion classifier from speech, including music+mood data, unlabeled speech with emotional content, and most importantly, emotion+speech data such as in http://emotion-research.net/databases...
Yann and Nicolas will study the literature on this subject and report on it on the speech wiki page. Pay attention to: - datasets we may get our hands on...
Code up both a learning-free and a learned convolutional aggregation followed by one of the fixed pooling methods (max by default). The learning-free convolutional aggregation just performs an...
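A sketch of one possible reading (the entry is truncated, so this is an assumption): the learning-free aggregation smooths the per-frame outputs with a fixed uniform kernel, while the learned variant uses trainable kernel weights (a random stand-in below); both are followed by max-pooling over time.

```python
import numpy as np

def conv_aggregate(frame_probs, kernel):
    """Convolve each class's per-frame output with `kernel` (valid mode),
    then max-pool the smoothed sequence over time."""
    smoothed = np.stack([np.convolve(frame_probs[:, c], kernel, mode="valid")
                         for c in range(frame_probs.shape[1])], axis=1)
    return smoothed.max(axis=0)

rng = np.random.default_rng(3)
P = rng.random((40, 6))                            # per-frame classifier outputs

k = 5
uniform = np.full(k, 1.0 / k)                      # learning-free: moving average
learned = rng.random(k)                            # stand-in for trained weights
learned /= learned.sum()

agg_fixed = conv_aggregate(P, uniform)
agg_learned = conv_aggregate(P, learned)
```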
Coordinate with PV for interfacing with single-frame classifier output. Code up and unit-test a variety of reasonable and simple heuristics for aggregating these outputs temporally: max, mean, p-norm, noisy-or (1...
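The listed heuristics can be sketched and unit-tested in a few lines of numpy (interface names here are hypothetical, pending the single-frame classifier format from PV):

```python
import numpy as np

def agg_max(p):
    return p.max(axis=0)

def agg_mean(p):
    return p.mean(axis=0)

def agg_pnorm(p, order=3):
    # p-norm pooling interpolates between mean (order=1) and max (order->inf)
    return np.mean(p ** order, axis=0) ** (1.0 / order)

def agg_noisy_or(p):
    # noisy-or: probability that at least one frame signals the emotion
    return 1.0 - np.prod(1.0 - p, axis=0)

# Tiny unit check: per-frame probabilities for 2 classes over 3 frames.
P = np.array([[0.1, 0.9],
              [0.2, 0.8],
              [0.3, 0.7]])
assert np.allclose(agg_max(P),  [0.3, 0.9])
assert np.allclose(agg_mean(P), [0.2, 0.8])
assert np.all(agg_mean(P) <= agg_pnorm(P) + 1e-12)   # mean <= p-norm <= max
assert np.all(agg_pnorm(P) <= agg_max(P) + 1e-12)
assert np.allclose(agg_noisy_or(P), [1 - 0.9 * 0.8 * 0.7, 1 - 0.1 * 0.2 * 0.3])
```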