Question on the activation for density in the student NN during distillation
Thanks a lot for your great work.
When debugging distillation, I found that the alpha from the student NN sometimes takes a negative value (though very close to 0). This comes from the teacher NN and student NN using different activations for density (see the sketch after the list below):
- teacher NN: ReLU (https://github.com/creiser/kilonerf/blob/master/local_distill.py#L335)
- student NN: leaky ReLU (https://github.com/creiser/kilonerf/blob/master/local_distill.py#L114)
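
To illustrate what I mean, here is a minimal sketch of how a leaky ReLU on the raw density output can produce a slightly negative alpha, while a ReLU cannot. The alpha formula and variable names are my own illustration (assuming the standard alpha = 1 - exp(-sigma * delta) compositing), not code taken from the repo:

```python
import torch
import torch.nn.functional as F

# Raw density output of the last linear layer (can be negative).
raw_sigma = torch.tensor([-0.5, 0.0, 2.0])
delta = 0.01  # distance between consecutive samples along the ray

# Teacher: ReLU clamps density at 0, so alpha >= 0 always.
sigma_teacher = F.relu(raw_sigma)
alpha_teacher = 1.0 - torch.exp(-sigma_teacher * delta)

# Student: leaky ReLU lets small negative densities through,
# so alpha dips slightly below 0 wherever raw_sigma < 0.
sigma_student = F.leaky_relu(raw_sigma, negative_slope=0.01)
alpha_student = 1.0 - torch.exp(-sigma_student * delta)

print(alpha_teacher)  # all entries >= 0
print(alpha_student)  # first entry is slightly negative
```

The negative values I observe are on this order of magnitude, i.e. tiny, which is why I suspect the choice is intentional rather than a bug.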
I guess the difference may be minor, but is there a reason for using leaky ReLU in the student NN?