DeepPersonality
Speech processing in the UDIVA dataset
The paper describes two audio processing methods: (1) extracting features directly from the entire audio segment, and (2) splitting the audio into segments and extracting features from each one separately. In the UDIVA dataset, however, each video clip is a dialogue between two people. When features are extracted from the entire audio segment, should the presence of different speakers be taken into account?
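To make the question concrete, here is a minimal sketch of the two strategies as I understand them. It is not the DeepPersonality code: `toy_features`, `whole_clip_features`, and `segment_features` are hypothetical names, the RMS/zero-crossing features are only a stand-in for whatever extractor the paper uses, and the 16 kHz sample rate and 1-second segment length are assumptions.

```python
import numpy as np


def toy_features(wave: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a real extractor: RMS energy and zero-crossing rate."""
    rms = np.sqrt(np.mean(wave ** 2))
    # Fraction of adjacent sample pairs whose sign differs.
    zcr = np.mean(np.abs(np.diff(np.signbit(wave).astype(np.int8))))
    return np.array([rms, zcr])


def whole_clip_features(wave: np.ndarray) -> np.ndarray:
    """Method 1: one feature vector for the entire audio segment."""
    return toy_features(wave)


def segment_features(wave: np.ndarray, segment_len: int) -> np.ndarray:
    """Method 2: cut the audio into fixed-length segments, one vector per segment."""
    n_segments = len(wave) // segment_len  # trailing partial segment is dropped
    segments = wave[: n_segments * segment_len].reshape(n_segments, segment_len)
    return np.stack([toy_features(s) for s in segments])


# Synthetic 4-second clip at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
sample_rate = 16000
wave = rng.standard_normal(sample_rate * 4)

global_feat = whole_clip_features(wave)          # shape (2,)
local_feats = segment_features(wave, sample_rate)  # shape (4, 2)
```

In both cases the features are computed over whatever is in the waveform, so for a two-person dialogue the whole-clip vector mixes both speakers. Separating them would need per-speaker labels or timestamps, which this sketch does not assume are available.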