DistributionLoss
First activation function without quantization?
Hi Ruizhou,
Thanks for sharing your code!
While going through your code, I found that you used LeakyReLU for the first activation function and didn't quantize its output. Therefore, the second convolution layer seems to take full-precision input instead of binary input. Previous works (e.g., XNOR-Net, DoReFa-Net) quantize the first activation as well.
Have you tried to quantize the first activation layer too?
Hi Hyungjun,
Thanks for pointing this out. I just tried changing the first layer to:
self.features0 = nn.Sequential(
    # full-precision first block: conv + pool + LeakyReLU, output not binarized
    nn.Conv2d(self.channels[0], self.channels[1], kernel_size=11, stride=4, padding=2),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.LeakyReLU(inplace=True),
    nn.BatchNorm2d(self.channels[1]),
)
self.features1 = nn.Sequential(
    # binarized second block: binarizing activation followed by a binary conv
    self.activation_func(),
    BinarizeConv2d(self.channels[1], self.channels[2], kernel_size=5, padding=2),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.BatchNorm2d(self.channels[2]),
)
and inserting a distribution loss layer between self.features0 and self.features1 (right before the self.activation_func() in self.features1). Without changing the hyper-parameters, I can get 47.2% top-1 accuracy.
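In case it helps to see the placement, here is a minimal runnable sketch of that arrangement; it is not the repository's actual code. DistLossPlaceholder, FirstLayersSketch, and the nn.Hardtanh / nn.Conv2d stand-ins for activation_func() and BinarizeConv2d are illustrative assumptions only.

import torch
import torch.nn as nn

class DistLossPlaceholder(nn.Module):
    # placeholder: any module returning a scalar penalty on the activations
    def forward(self, x):
        return (x ** 2).mean()

class FirstLayersSketch(nn.Module):
    def __init__(self, channels=(3, 64, 192)):
        super().__init__()
        self.features0 = nn.Sequential(
            nn.Conv2d(channels[0], channels[1], kernel_size=11, stride=4, padding=2),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.LeakyReLU(inplace=True),
            nn.BatchNorm2d(channels[1]),
        )
        self.dist_loss = DistLossPlaceholder()
        self.features1 = nn.Sequential(
            nn.Hardtanh(inplace=True),                                      # stands in for activation_func()
            nn.Conv2d(channels[1], channels[2], kernel_size=5, padding=2),  # stands in for BinarizeConv2d
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.BatchNorm2d(channels[2]),
        )

    def forward(self, x):
        x = self.features0(x)
        reg = self.dist_loss(x)  # distribution loss on the pre-binarization activations
        x = self.features1(x)
        return x, reg

# quick check
net = FirstLayersSketch()
out, reg = net(torch.randn(2, 3, 224, 224))
print(out.shape, reg.item())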
Thanks, Ruizhou
Thanks a lot!