Speech analysis features wishlist
Hi,
I noticed that several features and algorithms commonly used in speech processing are not available in Essentia and might be worth implementing. Here are my proposals:
- Mel Spectrogram
- Delta MFCC
- Delta-delta MFCC
- Vocal Tract Filtering
- Phase Distortion
- Phase Distortion Standard Deviation
- Harmonic Model Phase Distortion
- Pulse Model
- Pre-Emphasis Filter
- Direction Of Arrival: MUSIC, CSSM, TOPS, SRP-PHAT
Thanks in advance
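To make two of the simpler items in the list concrete, here is a minimal NumPy sketch of a pre-emphasis filter and of delta / delta-delta features. It is only an illustration of what I mean, not Essentia code: the function names `pre_emphasis` and `delta`, the coefficient 0.97 and the regression width of 2 are my own assumptions (they are common defaults in speech front ends), and the input is assumed to be a matrix with one row per frame, such as MFCCs.

```python
import numpy as np


def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass: y[n] = x[n] - coeff * x[n-1].

    The first sample is passed through unchanged.
    `coeff` is an assumed default, not an Essentia parameter.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size == 0:
        return signal
    return np.concatenate(([signal[0]], signal[1:] - coeff * signal[:-1]))


def delta(features, width=2):
    """Regression-based delta over time.

    `features` has shape (n_frames, n_coeffs). Edge frames are
    handled by repeating the first and last frame (an assumption).
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, width + 1))
    out = np.zeros_like(features)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]    # frame t + k
        behind = padded[width - k : width - k + n_frames]   # frame t - k
        out += k * (ahead - behind)
    return out / denom


def delta_delta(features, width=2):
    """Delta-delta is the delta of the delta features."""
    return delta(delta(features, width), width)
```

With a matrix `mfccs` of shape (n_frames, n_coeffs), `delta(mfccs)` and `delta_delta(mfccs)` return arrays of the same shape that can be stacked next to the static coefficients.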
I would also suggest adding an option/parameter to the StartStopSilence algorithm so that it returns a list of start and stop frames. Right now the algorithm provides just one start frame and one stop frame, which is fine for a song. In speech, however, there are several distinct events separated by silent frames. It would be nice to get lists for startFrame and stopFrame so that events can be segmented easily.
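As a rough sketch of the behaviour I have in mind, the snippet below marks each frame as active or silent by its mean power and returns one start/stop pair per active run. This is plain NumPy, not the Essentia implementation: the function name `silence_segments`, the frame and hop sizes, and the -60 dB threshold are all assumptions chosen for illustration.

```python
import numpy as np


def silence_segments(audio, frame_size=1024, hop_size=512, threshold_db=-60.0):
    """Return (start_frames, stop_frames) for every non-silent run.

    A frame counts as active when its mean power in dB exceeds
    `threshold_db`. All parameter names and defaults are hypothetical.
    """
    audio = np.asarray(audio, dtype=float)
    if audio.size < frame_size:
        return [], []
    n_frames = 1 + (audio.size - frame_size) // hop_size

    active = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        frame = audio[i * hop_size : i * hop_size + frame_size]
        power = np.mean(frame ** 2)
        level_db = 10.0 * np.log10(max(power, 1e-12))  # floor avoids log(0)
        active[i] = level_db > threshold_db

    # Pad with silence on both sides so runs touching the edges are closed.
    padded = np.concatenate(([False], active, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)        # first active frame of each run
    stops = np.flatnonzero(edges == -1) - 1    # last active frame of each run
    return starts.tolist(), stops.tolist()
```

For a recording with two utterances separated by a pause, this returns two entries in each list, where the current single startFrame/stopFrame pair would span from the first utterance to the end of the last one.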