DistributionLoss
First activation function without quantization?
Hi Ruizhou,
Thanks for sharing your code!
While going through your code, I found that you used LeakyReLU for the first activation function and didn't quantize its output. Therefore, the second convolution layer seems to take full-precision input instead of binary input. Previous works (e.g., XNOR-Net, DoReFa-Net) quantize the first activation as well.
Have you tried to quantize the first activation layer too?
Hi Hyungjun,
Thanks for pointing this out. I just tried changing the first layer to:
self.features0 = nn.Sequential(
    # full-precision first block: conv + pool + LeakyReLU, output not binarized
    nn.Conv2d(self.channels[0], self.channels[1], kernel_size=11, stride=4, padding=2),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.LeakyReLU(inplace=True),
    nn.BatchNorm2d(self.channels[1]),
)
self.features1 = nn.Sequential(
    # binarized second block: binarizing activation followed by a binary conv
    self.activation_func(),
    BinarizeConv2d(self.channels[1], self.channels[2], kernel_size=5, padding=2),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.BatchNorm2d(self.channels[2]),
)
and inserting a distribution loss layer between self.features0 and self.features1 (right before the self.activation_func() in self.features1). Without changing the hyper-parameters, I can get 47.2% top-1 accuracy.
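In case it helps to see the placement, here is a minimal runnable sketch of that arrangement; it is not the repository's actual code. DistLossPlaceholder, FirstLayersSketch, and the nn.Hardtanh / nn.Conv2d stand-ins for activation_func() and BinarizeConv2d are illustrative assumptions only.

import torch
import torch.nn as nn

class DistLossPlaceholder(nn.Module):
    # placeholder: any module returning a scalar penalty on the activations
    def forward(self, x):
        return (x ** 2).mean()

class FirstLayersSketch(nn.Module):
    def __init__(self, channels=(3, 64, 192)):
        super().__init__()
        self.features0 = nn.Sequential(
            nn.Conv2d(channels[0], channels[1], kernel_size=11, stride=4, padding=2),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.LeakyReLU(inplace=True),
            nn.BatchNorm2d(channels[1]),
        )
        self.dist_loss = DistLossPlaceholder()
        self.features1 = nn.Sequential(
            nn.Hardtanh(inplace=True),                                      # stands in for activation_func()
            nn.Conv2d(channels[1], channels[2], kernel_size=5, padding=2),  # stands in for BinarizeConv2d
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.BatchNorm2d(channels[2]),
        )

    def forward(self, x):
        x = self.features0(x)
        reg = self.dist_loss(x)  # distribution loss on the pre-binarization activations
        x = self.features1(x)
        return x, reg

# quick check
net = FirstLayersSketch()
out, reg = net(torch.randn(2, 3, 224, 224))
print(out.shape, reg.item())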
Thanks, Ruizhou
Thanks a lot!