
Features and model for audio only

Open haijing1995 opened this issue 2 years ago • 6 comments

Hello, the audio-only results in the docs seem great. Could you tell me which features you used and how the model was constructed?

haijing1995 avatar Nov 30 '22 04:11 haijing1995

Hey, I'm sorry, which results are you referring to?

RicherMans avatar Nov 30 '22 06:11 RicherMans

> Hey, I'm sorry, which results are you referring to?

The audio-only results for the LSTM and TCN in /docs/report.md.

haijing1995 avatar Nov 30 '22 10:11 haijing1995

Hey, thanks for noting these results; they were part of the paper during the development process.

The "HighOrder" features are just the standard mean, median, second-order, third-order, max, and min statistics extracted from a mel spectrogram.

By the way, I don't think these results are all that "good"; we obtained notably better results with self-supervised learning, such as in this paper.
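As a rough illustration, such pooled statistics can be computed per mel band and concatenated into one fixed-size vector. This is a minimal numpy sketch; the exact moment definitions, normalization, and feature order in the repo may differ, and `high_order_stats` is a hypothetical name:

```python
import numpy as np

def high_order_stats(mel: np.ndarray) -> np.ndarray:
    """Pool a (n_mels, n_frames) mel spectrogram over time into a fixed-size
    vector of per-band statistics: mean, median, 2nd and 3rd central moments,
    max, and min."""
    mean = mel.mean(axis=1)
    median = np.median(mel, axis=1)
    second = ((mel - mean[:, None]) ** 2).mean(axis=1)  # variance (2nd moment)
    third = ((mel - mean[:, None]) ** 3).mean(axis=1)   # 3rd central moment
    return np.concatenate(
        [mean, median, second, third, mel.max(axis=1), mel.min(axis=1)]
    )

# Any-length input pools to the same dimensionality: 64 bands * 6 stats = 384
mel = np.random.rand(64, 501)
feat = high_order_stats(mel)
print(feat.shape)  # (384,)
```

Because the pooling is over the time axis, utterances of any duration map to the same 384-dimensional vector, which is what makes these features convenient for fixed-size classifiers.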

RicherMans avatar Nov 30 '22 10:11 RicherMans

> Hey, thanks for noting these results; they were part of the paper during the development process.
>
> The "HighOrder" features are just the standard mean, median, second-order, third-order, max, and min statistics extracted from a mel spectrogram.
>
> By the way, I don't think these results are all that "good"; we obtained notably better results with self-supervised learning, such as in this paper.

Thanks for your reply. I have a few more questions:

  1. Each answer from each participant has a different duration, so the extracted features (e.g., mel spectrograms) also differ in length.
  2. Different participants gave different numbers of responses. To be able to train in batches, how do you unify these two dimensions (not the learned x-feature in the paper you mentioned)?

haijing1995 avatar Nov 30 '22 11:11 haijing1995

> Each answer from each participant has a different duration, so the extracted features (e.g., mel spectrograms) also differ in length.

We used a batch size of 1 for training, so no padding was needed.
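To make the point concrete: with a batch size of 1, each "batch" is exactly one participant's feature matrix, so sequences of different lengths never have to be padded to a common shape. A toy sketch with a plain generator (the repo's actual data loading is an assumption here, not shown):

```python
import numpy as np

def batches(features, labels, batch_size=1):
    """Yield one (1, T_i, D) tensor per participant; no padding required."""
    assert batch_size == 1, "variable-length sequences: one sample per batch"
    for x, y in zip(features, labels):
        yield x[None, ...], y  # add a leading batch axis

# Three utterances of different lengths, all with 40-dim frame features
feats = [np.random.randn(t, 40) for t in (120, 75, 300)]
for x, y in batches(feats, [0, 1, 0]):
    print(x.shape)  # (1, 120, 40), then (1, 75, 40), then (1, 300, 40)
```

The trade-off is slower training (no batched parallelism), but it avoids both padding artifacts and truncation on highly variable-length recordings.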

> Different participants gave different numbers of responses. To be able to train in batches, how do you unify these two dimensions (not the learned x-feature in the paper you mentioned)?

We really did train with a batch size of 1 for most papers, since, as you mention, the length differences between samples are substantial. However, as a note from us: the dataset is very small by common scientific standards, which leads to very large variance between experiments, so do not expect to run our experiments a single time and obtain the same result. On this dataset, the random seed has a far larger impact than most "optimization" methods.
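The seed-sensitivity point suggests a simple evaluation protocol: repeat each experiment under several seeds and report mean ± standard deviation instead of a single score. A toy sketch, where `run_experiment` is a stand-in for a full train/evaluate cycle (the noise model is illustrative, not measured):

```python
import random
import statistics

def run_experiment(seed: int) -> float:
    """Stand-in for one full train/evaluate run; replace with real training.
    Simulates seed-dependent scatter around a base score."""
    rng = random.Random(seed)
    return 0.6 + rng.uniform(-0.1, 0.1)

scores = [run_experiment(seed) for seed in range(5)]
print(f"F1: {statistics.mean(scores):.3f} +/- {statistics.stdev(scores):.3f}")
```

Reporting the spread across seeds makes it much easier to tell a genuine improvement from run-to-run noise on a small dataset like this one.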

RicherMans avatar Nov 30 '22 14:11 RicherMans

Thanks a lot for your help; I will try it.

haijing1995 avatar Dec 01 '22 01:12 haijing1995