tartarus
Architecture for audio-only training
I want to repeat your experiment for audio regression, but I cannot find the architecture for audio only. In the rec_dense module the code sets architecture=1, but architecture 1 is only for metadata. Are there no default params for audio-only regression? I would like to know the exact params and architecture of the audio CNN. The params in the paper are: ReLU activations; filter numbers 256, 512, 1024, 1024; max-pool size 4; 0.5 dropout for all layers; flattened output of 4096; no dense layer. My questions are:
- What is the filter kernel size? In the code I found kernel sizes (4,96), (4,1), (4,1), (1,1) and pool sizes (4,1), (4,1), (1,1), (1,1), but these params cannot produce a flattened output of 4096!
- Is the first kernel size really (4,96)? Can you help explain this?
- Is there really no dense layer?
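
The shape question can be checked with a small calculation. The following is only a sketch, not the repository's code. It assumes "valid" convolutions with stride 1, pooling with stride equal to the pool size, and 96 frequency bins (inferred from the (4,96) kernel). The number of input frames (322 here) is a hypothetical placeholder and should be replaced with the real patch length. The helper names `layer_shapes` and `flatten_size` are made up for illustration.

```python
# Layer settings as quoted in the question above.
FILTERS = [256, 512, 1024, 1024]
KERNELS = [(4, 96), (4, 1), (4, 1), (1, 1)]   # (time, frequency)
POOLS = [(4, 1), (4, 1), (1, 1), (1, 1)]      # (time, frequency)


def layer_shapes(n_frames, n_bins=96):
    """Return (time, freq, channels) after each conv + max-pool block.

    ASSUMPTIONS (not confirmed by the source): 'valid' convolution with
    stride 1, and pooling stride equal to the pool size (floor division).
    """
    t, f = n_frames, n_bins
    shapes = []
    for n_filters, (kt, kf), (pt, pf) in zip(FILTERS, KERNELS, POOLS):
        # 'valid' convolution shrinks each axis by (kernel - 1)
        t, f = t - kt + 1, f - kf + 1
        # non-overlapping max pooling divides each axis by the pool size
        t, f = t // pt, f // pf
        if t < 1 or f < 1:
            raise ValueError("input is too small for this stack of layers")
        shapes.append((t, f, n_filters))
    return shapes


def flatten_size(n_frames, n_bins=96):
    """Number of values after flattening the last block's output."""
    t, f, c = layer_shapes(n_frames, n_bins)[-1]
    return t * f * c


if __name__ == "__main__":
    N_FRAMES = 322  # hypothetical placeholder, replace with the real value
    for i, shape in enumerate(layer_shapes(N_FRAMES), start=1):
        print(f"block {i}: (time, freq, channels) = {shape}")
    print("flattened size:", flatten_size(N_FRAMES))
```

Since the last layer has 1024 filters, a flattened size of 4096 would require exactly 4 time steps to remain after the last block (4 × 1024 = 4096), so the result depends on the input length and on the padding mode actually used.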