koushik313
SMU can be converted to CUDA and TensorRT as well.
@mzzjuve Thanks for the information you shared. I suppose you used alpha=0.25 and mu=100000. Instead, I would recommend you try to **initialize alpha at 0.01 and mu at 2.0...
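Roughly, the setup I mean looks like the following. This is a minimal PyTorch sketch, not the reference implementation; it assumes the erf-based SMU form from the paper, keeps alpha fixed, and registers mu as a trainable parameter starting from 2.0:

```python
import torch
import torch.nn as nn


class SMU(nn.Module):
    """Sketch of SMU: f(x) = ((1 + a) x + (1 - a) x * erf(mu (1 - a) x)) / 2."""

    def __init__(self, alpha=0.01, mu=2.0):
        super().__init__()
        self.alpha = alpha                               # fixed hyperparameter
        self.mu = nn.Parameter(torch.tensor(float(mu)))  # learned with the network weights

    def forward(self, x):
        a = self.alpha
        return ((1 + a) * x + (1 - a) * x * torch.erf(self.mu * (1 - a) * x)) / 2
```

With mu registered as an nn.Parameter, the optimizer updates it together with the rest of the model, which is what using mu as a trainable parameter amounts to in practice.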
 @KMUST120, This is how max(x,0.25x) is approximated by SMU (alpha=0.25, mu=1.0). You can plot the same for SMU-1.
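If it helps, the comparison can be reproduced with a small standalone script. This is my own plotting sketch (not repo code), using scipy's erf in place of torch.erf:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import erf

alpha, mu = 0.25, 1.0
x = np.linspace(-5.0, 5.0, 500)

smu = ((1 + alpha) * x + (1 - alpha) * x * erf(mu * (1 - alpha) * x)) / 2
target = np.maximum(x, alpha * x)  # the function SMU is smoothing

plt.plot(x, target, label="max(x, 0.25x)")
plt.plot(x, smu, "--", label="SMU (alpha=0.25, mu=1.0)")
plt.legend()
plt.show()
```

Increasing mu makes the SMU curve follow max(x, 0.25x) more closely, since erf(mu*y) approaches sign(y).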
@HanAccount, First, fix a network with any activation function (for example, ReLU), then replace all the activation functions in the network with SMU or SMU-1 to see the effect...
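Concretely, the swap can be done by walking the model's submodules. A rough sketch, reusing the SMU module from the snippet above (the helper name replace_relu_with_smu is just mine):

```python
import torch.nn as nn


def replace_relu_with_smu(module, alpha=0.25, mu=1.0):
    """Recursively swap every nn.ReLU in `module` for a fresh SMU instance."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, SMU(alpha=alpha, mu=mu))
        else:
            replace_relu_with_smu(child, alpha=alpha, mu=mu)
    return module


# Example: a tiny ReLU network before and after the swap
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4), nn.ReLU())
print(replace_relu_with_smu(net))
```

Keeping the rest of the architecture and training setup fixed makes the comparison against the baseline activation a fair one.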
No, for SMU, mu does not have to be 1000000.0; you can initialize it at 1.0 as well, and it works remarkably well compared to other widely used activations. SMU-1 is a computationally...
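For completeness, this is how I understand SMU-1 from the paper: it smooths |(1-alpha)x| with a square root instead of erf, so there is no erf call at all. A sketch (the defaults here are only illustrative; smaller mu brings it closer to max(x, alpha*x)):

```python
import torch


def smu1(x, alpha=0.25, mu=1.0):
    """Sketch of SMU-1: sqrt(((1 - a) x)^2 + mu^2) smooths |(1 - a) x|."""
    a = alpha
    return ((1 + a) * x + torch.sqrt(((1 - a) * x) ** 2 + mu ** 2)) / 2
```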
@Tears1997 Thanks for the information you shared. I would also recommend trying to initialize **alpha at 0.01 and mu at 2.0 or 2.5 (use mu as a trainable parameter)...
@Tears1997 Thank you for your reply. Yes, I agree: for classification problems the functions work well at alpha=0.25, but for object detection you need to choose alpha=0.01. But...
@iFe1er Please update the values of the parameters for SMU; they have already been updated in the original paper.