SlimConv
When to use slimconv?
Hi, thanks for your work. If I apply slimconv to my model, do I need to replace every conv with slimconv? In other words, when should I use slimconv in place of a normal conv?
Thanks for your attention. You can use our slimconv to replace a normal 3x3 conv, but the following layer should preferably be a 1x1 conv that raises the feature dimension back up.
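A minimal sketch of that replacement pattern, assuming slim_conv_3x3 is the module released in this repo, that it is constructed from the input channel count, and that it reduces C input channels to C/2 + C/4 output channels (64 -> 48 in this example):

    import torch.nn as nn
    # assumption: slim_conv_3x3 comes from the SlimConv release; adjust the import to wherever it is defined
    from slim_conv import slim_conv_3x3

    in_ch = 64
    reduced = in_ch // 2 + in_ch // 4  # 48 channels after the slim conv

    block = nn.Sequential(
        slim_conv_3x3(in_ch),                                  # replaces a normal 3x3 conv, 64 -> 48
        nn.Conv2d(reduced, in_ch, kernel_size=1, bias=False),  # 1x1 conv raises the dimension back, 48 -> 64
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
    )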
Hi @JiaxiongQ, does a 1x1 conv have to follow slim_conv immediately? For example, if my feature maps have 64 channels, could I take the 48-channel feature maps produced by slim_conv, do some further processing on them, and only then apply a 1x1 conv to restore 64 channels? The only goal here is to reduce parameters.
If the 1x1 conv is applied right after slim_conv, there does not seem to be any saving in parameters or computation, does there?
That also works; anything is fine as long as the channel count produced by slimconv's channel pruning matches what the following layers expect.
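A hedged sketch of the variant asked about above (same assumptions about slim_conv_3x3 as in the earlier example), processing the 48-channel features before a 1x1 conv restores 64 channels:

    import torch.nn as nn

    in_ch = 64
    reduced = in_ch // 2 + in_ch // 4  # 48 channels produced by the slim conv

    block = nn.Sequential(
        slim_conv_3x3(in_ch),                                      # 64 -> 48
        nn.Conv2d(reduced, reduced, 3, padding=1, bias=False),     # further processing at 48 channels
        nn.BatchNorm2d(reduced),
        nn.ReLU(inplace=True),
        nn.Conv2d(reduced, in_ch, kernel_size=1, bias=False),      # 1x1 conv restores 48 -> 64
        nn.BatchNorm2d(in_ch),
    )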
Hi @JiaxiongQ, first of all, thank you very much for proposing such an excellent convolution structure. I tried to reproduce your ResNet20 results, so I modified the SC-ResNet.py file you shared so that it runs on the CIFAR datasets, and I restored the final 3/4C channels directly back to the full C channels (i.e. I expanded the lower path from 1/4C to 1/2C). However, my reproduced ResNet20 accuracy (90.55%) is far below the base model's accuracy (92.04%). My training settings were SGD with weight decay = 5e-4, batch size = 128, and an initial learning rate of 0.1 decayed by 0.1 every 50 epochs. The same happens with ResNet56. Did I miss anything compared with your original settings? Could you share your CIFAR training and model files? Looking forward to your reply.
Thanks for your attention.
1. Our model needs more training epochs per learning-rate stage; we suggest decaying the learning rate by 0.1 every 80 epochs.
2. Our previous experiments all targeted the res bottleneck; the attachment contains a modification for the res (basic) block, which gives some improvement on CIFAR-100.
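A minimal sketch of that schedule in PyTorch, using the hyperparameters quoted above; the momentum value, the stand-in model, and the total epoch budget are assumptions for illustration only:

    import torch
    import torch.nn as nn

    model = nn.Conv2d(3, 16, 3)  # stand-in; use your SC-ResNet here

    # SGD with weight decay 5e-4 and initial lr 0.1, as in the question above
    # (momentum=0.9 is an assumption, not stated in the thread)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    # decay the learning rate by 0.1 every 80 epochs, as suggested above
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=80, gamma=0.1)

    for epoch in range(240):  # total epoch budget is an assumption, not from the thread
        # ... run one training epoch here ...
        scheduler.step()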
(There was a small problem in that attachment; please refer to this version.)

    import torch.nn as nn
    import torch.nn.functional as F


    class LambdaLayer(nn.Module):
        # small helper module for the parameter-free shortcut (option 'A')
        def __init__(self, lambd):
            super(LambdaLayer, self).__init__()
            self.lambd = lambd

        def forward(self, x):
            return self.lambd(x)


    class BasicBlock(nn.Module):
        expansion = 1

        def __init__(self, in_planes, planes, stride=1, option='A', cnt=0):
            super(BasicBlock, self).__init__()
            if cnt > -1:
                # the first 3x3 conv is replaced by the slim conv (it applies BatchNorm internally)
                self.conv1 = myconv_3x3R(in_planes, stride=stride, kernel_size=3, padding=1, bias=False)
            else:
                self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(planes)

            # the second 3x3 conv takes the reduced in_planes//2 + in_planes//4 channels and restores `planes` channels
            self.conv2 = nn.Conv2d(in_planes // 2 + in_planes // 4, planes, kernel_size=3, stride=1,
                                   padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(planes)

            self.shortcut = nn.Sequential()
            if stride != 1 or in_planes != planes:
                if option == 'A':
                    # For CIFAR10 the ResNet paper uses option A.
                    self.shortcut = LambdaLayer(
                        lambda x: F.pad(x[:, :, ::2, ::2], (0, 0, 0, 0, planes // 4, planes // 4), "constant", 0))
                elif option == 'B':
                    self.shortcut = nn.Sequential(
                        nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False),
                        nn.BatchNorm2d(self.expansion * planes))

        def forward(self, x):
            # out = F.relu(self.bn1(self.conv1(x)))  # original form; myconv_3x3R already applies BatchNorm
            out = F.relu(self.conv1(x))
            out = self.bn2(self.conv2(out))
            out += self.shortcut(x)
            out = F.relu(out)
            return out
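For context, a quick shape check of this block (a sketch; it assumes myconv_3x3R, as defined later in this thread, is available):

    import torch

    block = BasicBlock(in_planes=64, planes=64, stride=1, option='A')
    x = torch.randn(2, 64, 32, 32)  # CIFAR-sized feature maps
    print(block(x).shape)           # expected: torch.Size([2, 64, 32, 32])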
@JiaxiongQ Thank you very much for your reply. In your attachment I found these two lines of code:
if cnt>-1: self.conv1 = myconv_3x3R(in_planes, stride=stride, kernel_size=3, padding=1, bias=False)
Is myconv_3x3R the class slim_conv_3x3(nn.Module), or a separately defined convolution? And does cnt have any special meaning?
Also, for a CIFAR-style ResNet whose basic block contains only two 3x3 convolutions, this amounts to modifying only the first 3x3 conv, while the second 3x3 stays a regular conv that restores the C channels, right?
Correct. cnt is just an extra hyperparameter for trading off performance against params/FLOPs.
myconv_3x3R was the name used before the code release; for the basic block it looks like this:

    import torch
    import torch.nn as nn


    class myconv_3x3R(nn.Module):
        def __init__(self, in_planes, kernel_size=3, padding=1, bias=False, stride=1, dilation=1):
            super(myconv_3x3R, self).__init__()
            self.stride = stride

            l1 = 2
            l2 = 4

            # bottom path: 1x1 squeeze to C/4, then a 3x3 conv at C/4 channels
            self.conv2_2 = nn.Sequential(
                nn.Conv2d(in_planes // l1, in_planes // l2, kernel_size=1, bias=False),
                nn.BatchNorm2d(in_planes // l2),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_planes // l2, in_planes // l2, kernel_size=kernel_size, stride=stride,
                          padding=padding, bias=False, dilation=dilation),
                nn.BatchNorm2d(in_planes // l2))

            # top path: a 3x3 conv that keeps C/2 channels
            self.conv2_1 = nn.Sequential(
                nn.Conv2d(in_planes // l1, in_planes // l1, kernel_size=kernel_size, stride=stride,
                          padding=padding, bias=False, dilation=dilation),
                nn.BatchNorm2d(in_planes // l1))

            # squeeze-and-excitation style channel weights
            self.fc = nn.Sequential(
                nn.Conv2d(in_planes, in_planes // 8, kernel_size=1, bias=False),
                nn.BatchNorm2d(in_planes // 8),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_planes // 8, in_planes, kernel_size=1),
                nn.Sigmoid())
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.l1 = l1

        def forward(self, x):
            out = x
            _, c, _, _ = out.size()

            # channel attention weights and their channel-flipped counterpart
            w = self.pool(out)
            w = self.fc(w)
            w_f = torch.flip(w, [1])

            out1 = w * out
            out2 = w_f * out

            # split each weighted tensor into two halves along the channel dimension and add the halves
            fs1 = torch.split(out1, c // 2, 1)
            fs2 = torch.split(out2, c // 2, 1)
            ft1 = fs1[0] + fs1[1]
            ft2 = fs2[0] + fs2[1]

            # top path keeps C/2 channels, bottom path is squeezed to C/4 channels
            out2_1 = self.conv2_1(ft1)
            out2_2 = self.conv2_2(ft2)

            # concatenated output has C/2 + C/4 = 3C/4 channels
            out = torch.cat((out2_1, out2_2), 1)
            return out
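A quick sanity check of the channel reduction performed by this module (a hedged sketch; the expected output width is in_planes//2 + in_planes//4):

    import torch

    m = myconv_3x3R(64)
    x = torch.randn(2, 64, 32, 32)
    print(m(x).shape)  # expected: torch.Size([2, 48, 32, 32]), i.e. 64 channels reduced to 48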