Wenbin Li
Thanks. No, DN4 does not need a pre-training stage; it is simply trained from scratch in this code. By the way, since pre-training is a general trick, you can...
You are welcome. As mentioned in our paper, one key reason is that we employ the much richer, non-summarized local representations to represent both the query image and the support class....
It's my pleasure. I am glad that DN4 works on your dataset! Yes, Transformer is a good choice. Unfortunately, I don't have many good suggestions or much experience with this part....
Yes, it's a normal situation. Because DN4 uses a sum operation to aggregate all the local similarities for a query image, the resulting similarity list tends to be flat. Fortunately, the...
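To make the sum aggregation above concrete, here is a minimal numpy sketch of a DN4-style image-to-class score: each query local descriptor keeps its k most similar support descriptors (by cosine similarity) and everything is summed into one scalar. The function and variable names are illustrative, not the repository's code.

```python
import numpy as np

def image_to_class_score(query_locals, support_locals, k=3):
    """Sum-aggregated image-to-class similarity (DN4-style sketch).

    query_locals:   (m, d) local descriptors of one query image
    support_locals: (n, d) pooled local descriptors of one class
    """
    # Cosine similarity between every query and support descriptor
    q = query_locals / np.linalg.norm(query_locals, axis=1, keepdims=True)
    s = support_locals / np.linalg.norm(support_locals, axis=1, keepdims=True)
    sim = q @ s.T                        # (m, n)
    # For each query descriptor, keep its k largest similarities
    topk = np.sort(sim, axis=1)[:, -k:]
    # Summing over all m*k values flattens the per-class score range
    return topk.sum()
```

Because the score is a sum over many local similarities, differences between classes get averaged out, which is why the per-class similarity list looks flat.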
Sorry for the late reply. Do you still need the implementation of prototypical network?
Hello. Regarding the first question: yes, we run a test during the training stage, which makes it convenient to monitor the model's performance. Regarding the second question: the purpose of this setting is to freeze the parameters of the model's BN layers after each training epoch, which improves the model's generalization ability; it is a small trick. Of course, you can remove this setting, but then you need to switch the model's mode back to model.train() in the training code.
EpisodeSize is the number of few-shot tasks used during training. From the task dimension, batch_size = 1; but from the sample dimension, batch_size = support samples + query samples. Changing it affects training to some extent, depending on whether you increase the number of tasks or only change the number of query samples; the size of the support set is usually left unchanged. Thanks! --------------------------------------------------- Wenbin Li, PhD. Assistant Professor, Department of Computer Science and Technology, State Key Laboratory for Novel Software Technology, Nanjing...
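For concreteness, a small sketch of the sample-level batch size implied above, assuming a standard N-way K-shot episode with a fixed number of queries per class (the function and variable names are illustrative):

```python
def episode_batch_size(n_way, k_shot, n_query):
    """Samples per episode = support samples + query samples."""
    support = n_way * k_shot   # support set size (usually kept fixed)
    query = n_way * n_query    # query set size (the part you may vary)
    return support + query

# e.g. a common 5-way 1-shot setting with 15 queries per class:
print(episode_batch_size(5, 1, 15))  # 5*1 + 5*15 = 80 samples
```

So at the task level the batch size is 1 episode, while at the sample level one episode already contains dozens of images.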
Hello. Do you observe severe overfitting on the dog and bird datasets? The gap between the validation and test results may be caused by the fixed BN layers; please try changing model.eval() on line 386 of CovaMNet_Train_5way1shot.py to model.train() and see how the new results look. Also, since the dog and bird datasets are relatively small, they are quite prone to overfitting; data augmentation really should be applied.
Hello, this is a very good question. We actually use a rather clever way to implement the learnable weight w. When computing the similarity between a query image Q and a class, suppose Q has m local similarities; we store these m similarities directly in mea_sim, whose shape is "number of classes × m". Then, in self.classifier, we use a Conv1d whose kernel size and stride are both equal to m (i.e., 441 in the paper). In this way, w is learned automatically: nn.Conv1d(1, 1, kernel_size=441, stride=441, bias=use_bias)
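To see why a Conv1d with kernel_size = stride = m recovers a per-class weighted sum over the m local similarities, here is a small numpy sketch of what that layer computes: non-overlapping windows of length m (one per class), each dotted with the same learnable kernel w. The numpy re-implementation is mine for illustration, not the repository's code.

```python
import numpy as np

def conv1d_kernel_stride_m(x, w, b=0.0):
    """Emulate nn.Conv1d(1, 1, kernel_size=m, stride=m) on a 1-D input.

    x: (num_classes * m,) flattened local similarities (like mea_sim)
    w: (m,) the learnable kernel, shared across all classes
    b: scalar bias
    Returns one weighted-sum score per class.
    """
    m = w.shape[0]
    # Stride == kernel size => non-overlapping windows, one per class
    windows = x.reshape(-1, m)        # (num_classes, m)
    return windows @ w + b            # (num_classes,)

# Toy check with num_classes=2, m=3
x = np.array([1., 2., 3., 4., 5., 6.])
w = np.array([0.1, 0.2, 0.3])
print(conv1d_kernel_stride_m(x, w))  # [1.4, 3.2]
```

Because the stride equals the kernel size, the convolution never mixes similarities from different classes, and training the single kernel is exactly training the weight vector w.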
Hello. In Table 2, each method uses exactly the settings and tricks from its original code, while in Table 3 we try to give all methods roughly the same settings and the same training tricks, including the number of training epochs. This is the main cause of the difference. This may still not be perfectly fair, but it is relatively reasonable.