CMU-MultimodalSDK
How to interpret data in COVAREP features?
As far as I know, the COVAREP acoustic features are sampled at 100 Hz. For each video, the acoustic features ndarray has the shape (time_in_seconds * sample_rate, 74). I understand the first axis, but I'm puzzled by the length of the second. What are the 74 values stored along the second axis?
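To make the shape concrete, here is a minimal sketch using a placeholder array (zeros, not real COVAREP output). The helper name `expected_shape` and the 12.5-second duration are made up for illustration, and nothing here uses the SDK's API. It only shows the layout I'm seeing, one row per 10 ms frame and 74 values per row; it says nothing about what those 74 values mean.

```python
import numpy as np

SAMPLE_RATE_HZ = 100   # frame rate as I understand it (one frame every 10 ms)
NUM_FEATURES = 74      # length of the second axis, the part I'm asking about


def expected_shape(duration_seconds):
    """Hypothetical helper: shape implied by (time_in_seconds * sample_rate, 74)."""
    return (int(round(duration_seconds * SAMPLE_RATE_HZ)), NUM_FEATURES)


# Placeholder standing in for one video's feature array (12.5 s is an arbitrary example).
features = np.zeros(expected_shape(12.5))
print(features.shape)        # (1250, 74)

# One row: the 74 values for the frame at t = 300 / 100 = 3.0 s.
frame = features[300]
print(frame.shape)           # (74,)

# One column: a single (unidentified) feature tracked over the whole video.
column = features[:, 0]
print(column.shape)          # (1250,)
```

So indexing along the first axis selects a moment in time, and indexing along the second selects one of the 74 quantities whose identity I'd like to know.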