spikingjelly
Mixing the STDP optimizer with gradient descent: the network fails to classify
Hi, I'm trying to train on my own dataset with the STDP optimizer, but the network fails to classify. Could you help me find where the problem is?
The output is:
epoch : 0 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 1 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 2 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 3 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 4 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 5 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
(My dataset has ten classes and the accuracy stays around 10%, i.e. chance level, so the network is effectively unable to classify.)
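One detail worth noting: a train_loss of exactly 0.1 against 10-class one-hot targets is precisely what an all-zero output produces under MSE, which would suggest the output never changes at all (in the posted loop, `loss.detach_()` also severs the autograd graph in place, so `loss1.backward()` cannot produce gradients for the network). A framework-free check of that arithmetic:

```python
# If the output firing rates are all zero, the MSE against a one-hot
# target over 10 classes is (1*1 + 9*0) / 10 = 0.1 -- matching the
# constant train_loss in the log above.
def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

out_fr = [0.0] * 10           # all-zero firing rates
label_onehot = [0.0] * 10
label_onehot[3] = 1.0         # any class index gives the same loss

print(mse(out_fr, label_onehot))  # 0.1
```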
My network is:
```python
class CSNN(nn.ModuleList):
    def __init__(self, T: int, channels: int, use_cupy=False):
        super(CSNN, self).__init__()
        self.T = T
        self.conv_fc = nn.Sequential(
            layer.Conv2d(3, channels, kernel_size=3, padding=1, bias=False),
            layer.BatchNorm2d(channels),
            neuron.IFNode(surrogate_function=surrogate.ATan()),
            layer.MaxPool2d(2, 2),  # 14 * 14
            layer.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            layer.BatchNorm2d(channels),
            neuron.IFNode(surrogate_function=surrogate.ATan()),
            layer.MaxPool2d(2, 2),  # 7 * 7
            layer.Flatten(),
            layer.Linear(channels * 32 * 32, channels * 4 * 4, bias=False),
            neuron.IFNode(surrogate_function=surrogate.ATan()),
            layer.Linear(channels * 4 * 4, 10, bias=False),
            neuron.IFNode(surrogate_function=surrogate.ATan()),
        )
        functional.set_step_mode(self, step_mode='m')

    def __len__(self):
        return len(self.conv_fc)

    def forward(self, x):
        x_seq = x.unsqueeze(0).repeat(self.T, 1, 1, 1, 1)  # [N, C, H, W] -> [T, N, C, H, W]
        x_seq = self.conv_fc(x_seq)
        fr = x_seq.mean(0)  # firing rate: mean over the T time steps
        return fr
```
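The forward pass above is plain rate coding: the static image is repeated for T time steps, and the class score is each output neuron's firing rate, i.e. its mean spike count over T. A framework-free sketch of that readout (the spike trains are made up for illustration):

```python
# Rate-coded readout: each output neuron emits a 0/1 spike per time step;
# the class score is its firing rate, the mean over the T steps.
# Hypothetical spike trains for T=4 steps and 3 output neurons.
spikes_over_time = [
    [0, 1, 0],  # t = 0
    [1, 1, 0],  # t = 1
    [0, 1, 0],  # t = 2
    [1, 1, 1],  # t = 3
]
T = len(spikes_over_time)
firing_rates = [sum(col) / T for col in zip(*spikes_over_time)]
predicted = max(range(len(firing_rates)), key=firing_rates.__getitem__)
print(firing_rates, predicted)  # [0.5, 1.0, 0.25] 1
```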
The training code is:

```python
start_epoch = 0
instances_stdp = (la.Conv2d,)

# collect the parameters to be trained by STDP (all Conv2d layers)
params_stdp = []
for m in net.modules():
    print('m : ', m)
    if isinstance(m, instances_stdp):
        print('instances_stdp : ', instances_stdp)
        for p in m.parameters():
            params_stdp.append(p)

# the remaining parameters are trained by gradient descent
params_stdp_set = set(params_stdp)
params_gradient_descent = []
for p in net.parameters():
    if p not in params_stdp_set:
        params_gradient_descent.append(p)

optimizer_gd = Adam(params_gradient_descent, lr=0.1)
optimizer_stdp = SGD(params_stdp, lr=0.1, momentum=0.8)

stdp_learners = []
for i, layer in enumerate(net.conv_fc):
    if isinstance(layer, instances_stdp):
        stdp_learners.append(
            learning.STDPLearner(step_mode=args.step_mode, synapse=layer,
                                 sn=net.conv_fc[i + 1],
                                 tau_pre=2.,
                                 tau_post=2.,
                                 f_pre=f_weight, f_post=f_weight)
        )
```
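For intuition about what `tau_pre` and `tau_post` control: pair-based STDP keeps an exponentially decaying trace per side and changes the weight whenever the other side spikes. The sketch below is the textbook scalar version of that rule, not SpikingJelly's actual `STDPLearner` implementation; all names and the exact update form are illustrative:

```python
# Simplified pair-based STDP for one synapse. pre_spikes / post_spikes
# are 0/1 spike trains; each trace decays with its time constant and is
# bumped by a spike. Potentiation on a post spike (scaled by the pre
# trace), depression on a pre spike (scaled by the post trace).
def stdp_run(pre_spikes, post_spikes, w=0.5, tau_pre=2.0, tau_post=2.0, lr=0.1):
    trace_pre = 0.0
    trace_post = 0.0
    for pre, post in zip(pre_spikes, post_spikes):
        trace_pre += -trace_pre / tau_pre + pre
        trace_post += -trace_post / tau_post + post
        w += lr * (trace_pre * post - trace_post * pre)
    return w

# pre fires just before post -> potentiation (w goes up)
w_ltp = stdp_run([1, 0, 0, 0], [0, 1, 0, 0])
# post fires just before pre -> depression (w goes down)
w_ltd = stdp_run([0, 1, 0, 0], [1, 0, 0, 0])
print(w_ltp > 0.5, w_ltd < 0.5)  # True True
```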
```python
net.to(args.device)

for epoch in range(start_epoch, args.epochs):
    start_time = time.time()
    net.train()
    for i in range(len(stdp_learners)):
        stdp_learners[i].enable()
    train_loss = 0
    train_acc = 0
    train_samples = 0
    for img, label in train_loader:
        optimizer_gd.zero_grad()
        optimizer_stdp.zero_grad()
        img = img.to(args.device)
        label = label.to(args.device)
        label_onehot = F.one_hot(label, 10).float()
        out_fr = net(img)
        loss = F.mse_loss(out_fr, label_onehot)
        loss1 = loss.detach_().requires_grad_(True)
        loss1.backward()
        # stdp
        optimizer_stdp.zero_grad()
        for i in range(len(stdp_learners)):
            stdp_learners[i].step(on_grad=True)
        optimizer_gd.step()
        optimizer_stdp.step()
        for i in range(len(stdp_learners)):  # clean the record
            stdp_learners[i].reset()
        train_samples += label.numel()
        train_loss += loss1.item() * label.numel()
        train_acc += (out_fr.argmax(1) == label).float().sum().item()
        torch.cuda.empty_cache()
        functional.reset_net(net)
    train_loss /= train_samples
    train_acc /= train_samples
    print('epoch : ', epoch, ' ; ', 'train_loss : ', train_loss, ' ; ', 'train_acc : ', train_acc)
```
I'd suggest setting the STDP learning rate to 0 and first checking whether the network converges with GD alone, to verify that the network trains correctly.
With lr set to 0 in `optimizer_stdp = SGD(params_stdp, lr=0., momentum=0.)`, the output is:

epoch : 0 ; train_loss : 0.10000000149011612 ; train_acc : 0.10852148579752367
epoch : 1 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 2 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 3 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 4 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 5 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
epoch : 6 ; train_loss : 0.10000000149011612 ; train_acc : 0.1088856518572469
After switching to a single optimizer over all parameters, the network classifies normally. The modified code:

```python
optimizer = torch.optim.SGD(net.parameters(), lr=args.lr, momentum=args.momentum)
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, args.epochs)

for epoch in range(start_epoch, args.epochs):
    start_time = time.time()
    net.train()
    for i in range(len(stdp_learners)):
        stdp_learners[i].enable()
    train_loss = 0
    train_acc = 0
    train_samples = 0
    for img, label in train_loader:
        optimizer.zero_grad()
        img = img.to(args.device)
        label = label.to(args.device)
        label_onehot = F.one_hot(label, 10).float()
        out_fr = net(img)
        loss = F.mse_loss(out_fr, label_onehot)
        loss.backward()
        optimizer.step()
        train_samples += label.numel()
        train_loss += loss.item() * label.numel()
        train_acc += (out_fr.argmax(1) == label).float().sum().item()
        torch.cuda.empty_cache()
        functional.reset_net(net)
    train_loss /= train_samples
    train_acc /= train_samples
    print('epoch : ', epoch, ' ; ', 'train_loss : ', train_loss, ' ; ', 'train_acc : ', train_acc)
```
The output:

epoch : 0 ; train_loss : 0.12450837841269663 ; train_acc : 0.13000728332119446
epoch : 1 ; train_loss : 0.13798252292362687 ; train_acc : 0.1540422432629279
epoch : 2 ; train_loss : 0.12840495524196754 ; train_acc : 0.1540422432629279
The CSNN used in this experiment can reach up to 95% accuracy on my dataset, so the network itself should be fine; I just don't know what goes wrong when training with STDP that leaves it unable to classify.
STDP is an unsupervised learner, so there is no guarantee that adding it improves performance. If pure GD training works fine, then the STDP hyperparameters will have to be tuned step by step.
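One of those hyperparameters is `f_weight`, which is passed as `f_pre`/`f_post` above but never defined in this thread. In the official SpikingJelly STDP example it is, to my recollection, a simple `torch.clamp`; here is a framework-free sketch of that shape (the bounds are an assumption, not taken from the thread):

```python
# f_pre / f_post shape the STDP weight update as a function of the
# current value. A common minimal choice is just bounding it, which
# keeps STDP from driving weights off to infinity.
def f_weight(x, low=-1.0, high=1.0):
    return max(low, min(high, x))

print(f_weight(2.5), f_weight(-3.0), f_weight(0.3))  # 1.0 -1.0 0.3
```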
Hi, I ran into the same problem. Did you ever solve it?
Let me stress again: this is a feature of STDP, not a bug 😂
STDP simply does not work on deep SNNs, even when the STDP implementation itself is correct; the tutorial only demonstrates how to use it.
I just ran into this problem as well. It looks like I'll have to do without STDP for now.