tartarus the architecture for only audio training

Open zhegeliang2 opened this issue 7 years ago • 0 comments

I want to reproduce your audio regression experiment, but I cannot find the architecture for audio only. In the rec_dense module the code sets architecture=1, but architecture 1 is only for metadata. Are there no default params for audio-only regression? I would like to know the exact params and architecture for the audio CNN. The params in the paper are: ReLU activation; filter counts 256, 512, 1024, 1024; max-pool size 4; 0.5 dropout on all layers; flattened output of 4096; no dense layer. My questions are:

What is the filter kernel size? In the code I found kernel sizes (4,96), (4,1), (4,1), (1,1) and pool sizes (4,1), (4,1), (1,1), (1,1), but these params cannot produce a flattened output of 4096!
Is the first kernel size really (4,96)? Can you help explain this?
Is there really no dense layer?
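
The shape arithmetic behind the first question can be sketched as below. This is only a rough check, not the repository's code: it assumes 'valid' padding, stride-1 convolutions, and non-overlapping max pooling (stride equal to pool size), none of which is confirmed in this issue. The function names and the 322 x 96 input size in the usage line are hypothetical and chosen for illustration.

```python
# Shape arithmetic for the conv/pool stack quoted in this issue.
# Assumptions (not confirmed by the repository): 'valid' padding,
# stride-1 convolutions, pooling stride equal to the pool size.

KERNELS = [(4, 96), (4, 1), (4, 1), (1, 1)]  # (time, frequency), from the code
POOLS = [(4, 1), (4, 1), (1, 1), (1, 1)]     # (time, frequency), from the code
FILTERS = [256, 512, 1024, 1024]             # filter counts, from the paper


def conv_out(size, kernel):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output length of non-overlapping max pooling (floor division)."""
    return size // pool


def flattened_size(n_frames, n_bins):
    """Number of values left after the four conv/pool layers are flattened."""
    t, f = n_frames, n_bins
    for (kt, kf), (pt, pf) in zip(KERNELS, POOLS):
        t, f = conv_out(t, kt), conv_out(f, kf)
        t, f = pool_out(t, pt), pool_out(f, pf)
        if t < 1 or f < 1:
            raise ValueError("input too small for this stack")
    # The last layer's filter count is the channel dimension.
    return t * f * FILTERS[-1]


if __name__ == "__main__":
    # Hypothetical input: 322 time frames by 96 frequency bins.
    print(flattened_size(322, 96))
```

Because the first kernel spans all 96 frequency bins, the frequency axis collapses to 1 after the first layer, so a flattened output of 4096 would need exactly 4 time steps left at the end (4 x 1024). Under the assumptions above, that only happens for inputs much shorter than 322 frames.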

Nov 24 '17 04:11 zhegeliang2
