Chen Ma
Maybe the low success rate is because I use PyTorch 0.4 and Python 3.6, which are not your tested versions? PyTorch 0.4 eliminates the Variable class and modifies...
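For reference, this is a minimal sketch of the API change I mean (not the repo's code), showing the 0.4-style equivalents of the old Variable usage:
```
import torch

# Pre-0.4 style: tensors had to be wrapped in Variable to track gradients
# from torch.autograd import Variable
# x = Variable(torch.randn(3), requires_grad=True)

# 0.4 style: Variable is merged into Tensor
x = torch.randn(3, requires_grad=True)
loss = (x ** 2).sum()
loss.backward()
print(x.grad)        # gradients now live directly on the tensor
print(loss.item())   # .data[0] on a scalar loss is replaced by .item()
```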
Is the FGSM you tested the single-step version or the iterative version (also called the Basic Iterative Method, BIM)? The original single-step FGSM performs worse than its multi-step version.
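To make the distinction concrete, here is a rough sketch of the two attacks as I understand them (model, x, y, eps, the step count, and the [0, 1] image range are placeholder assumptions, not your test setup):
```
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Single-step FGSM: one signed-gradient step of size eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # assumes images are normalized to [0, 1]
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

def bim(model, x, y, eps, alpha=None, steps=10):
    """Iterative FGSM / BIM: many small steps, clipped back to the eps-ball."""
    alpha = alpha or eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project back into the eps-ball around x and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```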
@jimo17 (1) No, they are different loss functions. The loss function used in training the simulator is the Mean-Squared-Error (MSE) loss; the loss function used in attacking is the logits-difference loss...
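As a rough sketch of the two losses I mean (the function names are illustrative, not the exact code in this repo):
```
import torch
import torch.nn.functional as F

# (1) Training the simulator: MSE between the simulator's logits
#     and the target model's logits
def simulator_loss(sim_logits, target_logits):
    return F.mse_loss(sim_logits, target_logits)

# (2) Attacking: a logits-difference (margin) loss that pushes the
#     true-class logit below the largest other-class logit
def logits_difference_loss(logits, label):
    true_logit = logits.gather(1, label.unsqueeze(1)).squeeze(1)
    other_max = logits.scatter(1, label.unsqueeze(1), float('-inf')).max(dim=1)[0]
    # minimize this for an untargeted attack
    return (true_logit - other_max).mean()
```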
@bambooboy Which parameter are you referring to?
@gzliyan113 I think this implementation is not Reptile but MAML. As you can see, MAML's meta-update is based on the sum of gradients over all tasks.
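To make the difference concrete, here is a minimal first-order sketch under assumed names (model, tasks, loss_fn, task.sample(), and the learning rates are placeholders, not this repo's API):
```
import copy
import torch

def inner_adapt(model, task, loss_fn, inner_lr, inner_steps=1):
    """SGD-adapt a copy of the model on one task; return the adapted copy."""
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        x, y = task.sample()
        opt.zero_grad()
        loss_fn(adapted(x), y).backward()
        opt.step()
    return adapted

def maml_step(model, tasks, loss_fn, inner_lr, meta_lr):
    """First-order MAML: sum post-adaptation gradients over ALL tasks, then update."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for task in tasks:
        adapted = inner_adapt(model, task, loss_fn, inner_lr)
        x, y = task.sample()  # query set
        grads = torch.autograd.grad(loss_fn(adapted(x), y), adapted.parameters())
        for mg, g in zip(meta_grads, grads):
            mg += g
    with torch.no_grad():
        for p, mg in zip(model.parameters(), meta_grads):
            p -= meta_lr * mg  # one update with the summed gradient

def reptile_step(model, tasks, loss_fn, inner_lr, meta_lr):
    """Reptile: move the initialization toward each task's adapted weights."""
    for task in tasks:
        adapted = inner_adapt(model, task, loss_fn, inner_lr)
        with torch.no_grad():
            for p, p_adapted in zip(model.parameters(), adapted.parameters()):
                p += meta_lr * (p_adapted - p)
```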
I simply modified the code's config settings: translated_img_size = 100, sensorBandwidth = 8, minRadius = 8. I wanted to see how the code performs on 100x100 translated images, but the code converges...
@siahuat0727 I may have found a bug: in the original paper, the authors say there are two networks, the original network and the pruned one, but your code seems to have just...
@siahuat0727 I think that to solve this problem, you need to freeze the pruned channels inside the kernels after pruning, or create another pruned network and copy the original network's...
@siahuat0727 I still think this step has a problem, because you use
```
with torch.no_grad():
    for name, W in conv_weights:
        W.grad[mask[name]] = 0
```
but you freeze all the weights of...
Furthermore, the BN layer is also affected by the number of channels; if you train with all the channels, the BN layer won't work properly either.
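As a rough sketch of what I mean by freezing the pruned channels, including the BN parameters (bn_params and the mask[name] indexing are my assumptions, not your code's actual structures):
```
import torch

def zero_pruned_grads(conv_weights, bn_params, mask):
    """After loss.backward(), zero the gradients of pruned channels so they stay frozen."""
    with torch.no_grad():
        for name, W in conv_weights:
            W.grad[mask[name]] = 0          # conv kernels of pruned output channels
        for name, (gamma, beta) in bn_params:
            gamma.grad[mask[name]] = 0      # BN scale of pruned channels
            beta.grad[mask[name]] = 0       # BN shift of pruned channels

def zero_pruned_weights(conv_weights, bn_params, mask):
    """Also zero the weights themselves, so pruned channels output nothing
    and do not feed the following layers during training."""
    with torch.no_grad():
        for name, W in conv_weights:
            W[mask[name]] = 0
        for name, (gamma, beta) in bn_params:
            gamma[mask[name]] = 0
            beta[mask[name]] = 0
```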