
CarliniWagnerL2Attack on MNIST does NOT work

wenhaoyong opened this issue on Jan 23, 2022 · 1 comment

Thanks for this awesome toolbox. When I try to attack MNIST using CarliniWagnerL2Attack, the test results indicate that the attack is not successful. Here is the code:

    testset = torchvision.datasets.MNIST(root='./dataset', train=False, download=True, transform=transform_test)
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=4)

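    # construct a targeted C&W L2 attack against target_model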
    cw_attack = CarliniWagnerL2Attack(predict=target_model,
                                      num_classes=10,
                                      confidence=2.0,
                                      targeted=True,
                                      learning_rate=0.001,
                                      binary_search_steps=5,
                                      max_iterations=1000,
                                      abort_early=True,
                                      clip_min=0.0,
                                      clip_max=1.0)

    # construct adversarial samples
    for i, data in enumerate(testloader, 0):
        x, y = data
        x, y = x.to(device), y.to(device)
        y_pred = target_model(x).argmax(dim=1)
        print("y_pred:", y_pred)

        # Random target construction
        if y.size() != torch.Size([]):
            range_ = y.size()[0]
        else:
            range_ = 1
        targets = []
        for index in range(range_):
            target = randint(0, 9)
            while target == y[index].item():
                target = randint(0, 9)
            targets.append(target)
        attack_target = torch.tensor(targets).to(device)
        print("attack_target:", attack_target)

        # C&W
        with ctx_noparamgrad_and_eval(target_model):
            x_adv = cw_attack.perturb(x, attack_target)
        y_pred_adv = target_model(x_adv).argmax(dim=1)
        print("y_pred_adv:", y_pred_adv)
        break  # stop after the first batch

And the results were:

y_pred: tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5], device='cuda:0')
attack_target: tensor([6, 8, 3, 7, 8, 2, 5, 6, 0, 2, 2, 8, 4, 6, 2, 2], device='cuda:0')
y_pred_adv: tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5], device='cuda:0')

The predicted labels after the C&W attack are the same as before. Any tips would be appreciated.

— wenhaoyong, Jan 23, 2022

Hi wenhaoyong, depending on the normalization of the inputs, you may want to adjust clip_min and clip_max (e.g., to [-1, 1]) and increase the learning rate; see the sketch below.
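
For example, assuming transform_test normalizes MNIST with mean 0.5 and std 0.5 so that pixels land in [-1, 1] (an assumption; check your own transform), the attack construction might look like the following sketch. The learning_rate of 0.01 is only a starting point, not a tuned value:

    # Sketch: match the attack's box constraint to the normalized input range.
    # Assumes transform_test ends with transforms.Normalize((0.5,), (0.5,)),
    # which maps MNIST pixels from [0, 1] to [-1, 1].
    cw_attack = CarliniWagnerL2Attack(predict=target_model,
                                      num_classes=10,
                                      confidence=2.0,
                                      targeted=True,
                                      learning_rate=0.01,   # larger step than 0.001
                                      binary_search_steps=5,
                                      max_iterations=1000,
                                      abort_early=True,
                                      clip_min=-1.0,        # match the normalized pixel range
                                      clip_max=1.0)

Alternatively, keep clip_min=0.0 and clip_max=1.0 but drop the Normalize step from transform_test (or fold the normalization into the model), so the attack optimizes over the same range the data actually occupies.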

— masoudhashemi, Feb 6, 2022