Overfitting when training ACDNet20 in TensorFlow

Open mrerdem opened this issue 7 months ago • 0 comments

Thank you for the paper and the repository.

When I follow the procedure in "B. Rebuilding ACDNet20 in Tensorflow", the model starts overfitting after only a few epochs, and the final validation accuracy does not exceed ~30%. I am using "micro_acdnet_pruned_trained_fold4_86.00" as the reference torch model for the first input.

I should also note that the structure of the pre-trained TF model (acdnet20_20khz_fold4) does not match any of the 4 torch models provided. The pre-trained TF model has conv filters of [4, 32, 12, 23, 18, 38, 43, 62, 58, 77, 37, 50], whereas the torch model has [7, 20, 10, 14, 22, 31, 35, 41, 51, 67, 69, 48]. I also tried training from scratch in TF using the architecture of the pre-trained TF model, but the overfitting persists.

At first I thought the cause was that the weight_decay parameter in SGD is now deprecated. However, I implemented L2 regularization in all Conv layers to get the same functionality, and the overfitting is still there.
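
One detail that may be worth checking in such a replacement is the scaling. torch-style SGD adds `weight_decay * w` to the gradient, while an L2 penalty of the form `l2 * sum(w**2)` (the form used by `tf.keras.regularizers.L2`) contributes `2 * l2 * w`, so the matching factor would be `l2 = weight_decay / 2`. The sketch below checks this arithmetic on a single SGD step in plain Python. The learning rate, decay value, weights, and gradients are made-up illustration values, not the repository's settings, and the helper names are hypothetical.

```python
# One SGD step under two formulations of weight decay.
# All numbers below are illustrative assumptions, not values from the ACDNet repo.

def sgd_step_weight_decay(w, grad, lr, weight_decay):
    """torch-style SGD: weight_decay * w is added to the gradient."""
    return [wi - lr * (gi + weight_decay * wi) for wi, gi in zip(w, grad)]

def sgd_step_l2_penalty(w, grad, lr, l2):
    """Plain SGD on loss + l2 * sum(w**2); the penalty's gradient is 2 * l2 * w."""
    return [wi - lr * (gi + 2.0 * l2 * wi) for wi, gi in zip(w, grad)]

def equivalent_l2(weight_decay):
    """L2 factor that reproduces a given torch-style weight_decay."""
    return weight_decay / 2.0

w = [0.5, -1.25, 2.0]          # toy weights
grad = [0.1, -0.3, 0.05]       # toy gradients of the data loss
lr, weight_decay = 0.1, 5e-4   # assumed hyperparameters

reference = sgd_step_weight_decay(w, grad, lr, weight_decay)
matched = sgd_step_l2_penalty(w, grad, lr, equivalent_l2(weight_decay))
naive = sgd_step_l2_penalty(w, grad, lr, weight_decay)  # same number reused as l2

print("max |reference - matched|:", max(abs(a - b) for a, b in zip(reference, matched)))
print("max |reference - naive|:  ", max(abs(a - b) for a, b in zip(reference, naive)))
```

Reusing the torch value directly as the L2 factor doubles the effective decay rather than weakening it, so by itself it would not explain overfitting. Note also that torch's `weight_decay` applies to every parameter passed to the optimizer (biases and batch-norm parameters included, unless parameter groups exclude them), whereas regularizing only the Conv kernels covers a narrower set.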

Do you have any suggestions for fixing this issue?

mrerdem, Jul 01 '24 19:07