
Performance on CNN

Open rnbguy opened this issue 5 years ago • 7 comments

How does it perform on a CNN? I have a .mat file containing the weights for a standard MNIST CNN classifier.

cnn = Sequential([
        MIPVerify.get_conv_params(w_dict, "00-Conv2D", (3,3,1,32)),
        ReLU(),
        MIPVerify.get_conv_params(w_dict, "01-Conv2D", (3,3,32,64)),
        ReLU(),
        Flatten(4),
        MIPVerify.get_matrix_params(w_dict, "04-Dense", (50176, 128)),
        ReLU(),
        MIPVerify.get_matrix_params(w_dict, "06-Dense", (128, 10))
    ], "mnist_cnn.n2")

d = MIPVerify.find_adversarial_example(
        cnn,
        MIPVerify.get_image(mnist.train.images, 1),
        10,
        GurobiSolver(),
        pp=MIPVerify.LInfNormBoundedPerturbationFamily(0.04),
    )

But the progress bar estimates that it will take more than 10 days to finish.

  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:08
  Calculating lower bounds: 100%|███████████████████████| Time: 0:00:06
  Imposing relu constraint: 100%|███████████████████████| Time: 0:00:05
  Calculating upper bounds:   0%|                       |  ETA: 11.14 days

It is mentioned in the paper that the tool finished for the whole test set in ~10 hours. Am I doing it wrong?

PS. I don't know if it matters, but I am using a Jupyter notebook.

rnbguy avatar Sep 04 '19 13:09 rnbguy

Was your network trained with a robustness objective? (Knowing this will help me debug the issue).

Solve times are sensitive to the number of unstable ReLUs (as outlined in Section 5.2.1 of the paper) rather than to the structure of the network. Anecdotally, we observed that networks trained without a robustness objective had many unstable ReLUs, necessitating the introduction of a large number of binary variables; as such, solve times tend to be slow for these networks.
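For intuition, here is a small illustrative Python sketch (not MIPVerify code; count_unstable_relus is a made-up helper): a ReLU unit is "unstable" when its pre-activation bounds straddle zero, and each such unit contributes one binary variable to the MIP.

```python
def count_unstable_relus(lower_bounds, upper_bounds):
    """Count ReLU units whose pre-activation interval [l, u] contains 0.

    Units with u <= 0 are always off and units with l >= 0 are always on;
    both behave linearly and need no binary variable. Only units with
    l < 0 < u ("unstable") introduce a binary variable into the MIP.
    """
    return sum(1 for l, u in zip(lower_bounds, upper_bounds) if l < 0 < u)

# Only the first interval straddles zero, so one unit out of three is unstable.
print(count_unstable_relus([-1.0, 0.5, -2.0], [2.0, 3.0, -0.5]))  # 1
```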

The step that the code is currently stuck on is determining bounds (outlined in Section 4.2) on the input to each rectified linear unit. You could try passing the keyword argument tightening_algorithm=interval_arithmetic to find_adversarial_example; this should speed up the computation of bounds, but the main solve process to find adversarial examples will still likely be slow. Let me know what you see.
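Interval arithmetic is the cheap end of the bound-tightening spectrum: it propagates elementwise input bounds through each affine layer in closed form, rather than solving an LP or MIP per unit. A generic sketch (interval_affine is a made-up name, not the package's implementation):

```python
import numpy as np

def interval_affine(l, u, W, b):
    """Bounds on y = W @ x + b given elementwise bounds l <= x <= u.

    A positive weight picks up the same-side bound; a negative weight swaps
    sides. Fast, but looser than LP/MIP tightening, so downstream ReLUs are
    more often classified as unstable.
    """
    W_pos = np.maximum(W, 0.0)
    W_neg = np.minimum(W, 0.0)
    return W_pos @ l + W_neg @ u + b, W_pos @ u + W_neg @ l + b

# y = x1 - x2 with x in [0, 1]^2 gives y in [-1, 1].
lo, hi = interval_affine(np.zeros(2), np.ones(2),
                         np.array([[1.0, -1.0]]), np.zeros(1))
print(lo, hi)  # [-1.] [1.]
```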

(Using a Jupyter notebook shouldn't change anything --- I've used them too and not seen a slowdown in solve times.)

vtjeng avatar Sep 07 '19 23:09 vtjeng

I tried with tightening_algorithm=interval_arithmetic on a smaller CNN. The bound-determination step finishes, but then it gets stuck at solving.

[notice | MIPVerify]: Attempting to find adversarial example. Neural net predicted label is 4, target labels are [10]
[notice | MIPVerify]: Rebuilding model from scratch. This may take some time as we determine upper and lower bounds for the input to each non-linear unit.
  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:00
  Calculating lower bounds: 100%|███████████████████████| Time: 0:00:00
  Imposing relu constraint: 100%|███████████████████████| Time: 0:00:00
  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:00
  Calculating lower bounds: 100%|███████████████████████| Time: 0:00:00
  Imposing relu constraint: 100%|███████████████████████| Time: 0:00:00
[notice | MIPVerify]: The model built will be cached and re-used for future solves, unless you explicitly set rebuild=true.

It has been like this for at least 30 minutes. This was the Keras model:

model = Sequential()
model.add(Conv2D(2, (3, 3),
                 activation='relu',
                 padding='same',
                 input_shape=input_shape))
model.add(Conv2D(5, (2, 2),
                 padding='same',
                 activation='relu'))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

Also, I was not using a CNN trained with any robustness objective, but I was going to do that eventually. So I tried adversarially training the same model in Cleverhans, which gets ~97% accuracy on both legitimate and adversarial examples. But that one gets stuck as well.

rnbguy avatar Sep 09 '19 09:09 rnbguy

For networks trained without a robustness objective, here are a few other things you can try:

  • Try a smaller bounded norm just to see that everything is working --- for example, LInfNormBoundedPerturbationFamily(0.005). (A large proportion of samples should have adversarial examples even within this small bounded norm).
  • Try the methods outlined in this paper to improve verifiability: https://arxiv.org/pdf/1809.03008.pdf. An easy first step is to apply l1-regularization on the weights; we've observed that this helps with verifiability (and also improves robustness).
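As a concrete illustration of the l1-regularization suggestion, here is a framework-agnostic sketch (loss_with_l1 and lam are made-up names, not part of any library):

```python
import numpy as np

def loss_with_l1(base_loss, weight_matrices, lam=1e-4):
    """Training loss plus an l1 penalty on all weight matrices.

    Driving small weights toward zero tends to stabilize more ReLUs,
    which is the mechanism the linked paper exploits for verifiability.
    """
    return base_loss + lam * sum(np.abs(W).sum() for W in weight_matrices)

W = np.array([[1.0, -2.0], [0.0, 3.0]])  # sum of |W| is 6.0
print(loss_with_l1(0.5, [W], lam=0.1))   # 0.5 + 0.1 * 6.0 = 1.1
```

In Keras the same effect comes from passing a kernel_regularizer (e.g. keras.regularizers.l1) to each Conv2D/Dense layer.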

What adversarial examples was the Cleverhans model trained to be robust to? (It would be important to know if these adversarial examples had a bounded l1 norm, l2 norm, l infinity norm or something else entirely)

(copied over from previous comment because that was not formatting well)

vtjeng avatar Sep 10 '19 06:09 vtjeng

I tried with

d = MIPVerify.find_adversarial_example(
        cnn,
        MIPVerify.get_image(mnist.train.images, 1),
        10,
        GurobiSolver(),
        pp=MIPVerify.LInfNormBoundedPerturbationFamily(0.005),
        tightening_algorithm=interval_arithmetic,
    )

But it throws errors,

[notice | MIPVerify]: Attempting to find adversarial example. Neural net predicted label is 6, target labels are [10]
[notice | MIPVerify]: Rebuilding model from scratch. This may take some time as we determine upper and lower bounds for the input to each non-linear unit.
  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:00
  Calculating lower bounds: 100%|███████████████████████| Time: 0:00:00
  Imposing relu constraint: 100%|███████████████████████| Time: 0:00:00
  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:00
  Calculating lower bounds: 100%|███████████████████████| Time: 0:00:00
  Imposing relu constraint: 100%|███████████████████████| Time: 0:00:00
[notice | MIPVerify]: The model built will be cached and re-used for future solves, unless you explicitly set rebuild=true.
SystemError: opening file "/tmp/julia/MIPVerify/models/mnist_cnn.n2.(1, 28, 28, 1).linf-norm-bounded-0.005.10490436085773664469.jls": No such file or directory

Stacktrace:
 [1] #open#312(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::getfield(MIPVerify, Symbol("##52#54")), ::String, ::Vararg{String,N} where N) at ./iostream.jl:373
 [2] open at ./iostream.jl:373 [inlined]
 [3] get_reusable_model(::Sequential, ::Array{Float64,4}, ::MIPVerify.LInfNormBoundedPerturbationFamily, ::GurobiSolver, ::MIPVerify.TighteningAlgorithm, ::Bool, ::Bool) at /home/ranadeep/.julia/packages/MIPVerify/2slfm/src/models.jl:150
 [4] get_model at /home/ranadeep/.julia/packages/MIPVerify/2slfm/src/models.jl:79 [inlined]
 [5] macro expansion at /home/ranadeep/.julia/packages/MIPVerify/2slfm/src/MIPVerify.jl:119 [inlined]
 [6] macro expansion at ./util.jl:213 [inlined]
 [7] #find_adversarial_example#71(::Bool, ::MIPVerify.LInfNormBoundedPerturbationFamily, ::Int64, ::Float64, ::Bool, ::MIPVerify.TighteningAlgorithm, ::GurobiSolver, ::Bool, ::Bool, ::MIPVerify.AdversarialExampleObjective, ::typeof(find_adversarial_example), ::Sequential, ::Array{Float64,4}, ::Int64, ::GurobiSolver) at /home/ranadeep/.julia/packages/MIPVerify/2slfm/src/MIPVerify.jl:103
 [8] (::getfield(MIPVerify, Symbol("#kw##find_adversarial_example")))(::NamedTuple{(:pp, :tightening_algorithm),Tuple{MIPVerify.LInfNormBoundedPerturbationFamily,MIPVerify.TighteningAlgorithm}}, ::typeof(find_adversarial_example), ::Sequential, ::Array{Float64,4}, ::Int64, ::GurobiSolver) at ./none:0
 [9] top-level scope at In[7]:1

Anyway, thanks for the advice. I will check your paper.

For Cleverhans's adversarial training, I copy-pasted from mnist_tutorial_keras_tf.py, except for the model, which I constructed manually.

rnbguy avatar Sep 10 '19 12:09 rnbguy

Hi, thank you for making the code available. I'm not sure if I should start a new issue, but I also have a problem with a CNN. I am running a model similar to your CNN_B on CIFAR10. However, I ran into issues with all 3 types of tightening_algorithm (I'm talking about 1 sample and 1 target):

  1. mip: Starting from the 3rd upper/lower computation, it takes ~8hrs.
  2. LP: 3rd upper computation takes ~1hr.
  3. interval: My code hangs.

This is how I test my model before proceeding to batch mode:

MIPVerify.find_adversarial_example(nn, sample_image, 2, 
    GurobiSolver(), pp=MIPVerify.LInfNormBoundedPerturbationFamily(2/255),
    cache_model=false, tightening_algorithm=MIPVerify.TighteningAlgorithm(2))

I tried to figure it out as much as I could, but so far I can't make it run fast enough to verify the whole CIFAR10 test set. If you can, could you point out where I got it wrong and whether there is anything I can do to make the verification faster? I tried with a very sparse NN, but I couldn't make it faster. Thank you in advance for your help!

gear avatar May 31 '20 13:05 gear

@gear, continuing the conversation in this thread works well.

A few follow-up questions:

  • When you say that your code hangs with interval, what is the last log line that you get stuck on?
  • What robustness objective was your network trained with?
  • How did you obtain the sparse NN? Did you post-process a network (e.g. by removing small weights), train with an explicit objective to induce sparsity in the weights, or something else?

The MIPVerify package anecdotally (and perhaps counterintuitively) has longer solve times when the network being verified isn't robust. As a sanity check, would you run a strong first-order attack on your network to see what proportion of samples an attack can be found for within 2/255? You could start with Madry Lab's robustness package, which implements the PGD attack; they have a convenient CLI here.
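To make the sanity check concrete, here is a toy numpy sketch of PGD under an l-infinity constraint, using a linear score with a hand-written gradient (pgd_linf is a made-up name; a real check would use a framework's autograd, e.g. the PGD implementation in the robustness package):

```python
import numpy as np

def pgd_linf(x, w, eps, step, n_steps):
    """Maximize the score w @ x' subject to ||x' - x||_inf <= eps, x' in [0, 1].

    Each step moves in the sign of the gradient, then projects back into the
    allowed l_inf ball and the valid pixel range.
    """
    x_adv = x.copy()
    for _ in range(n_steps):
        grad = w                                  # d(w @ x)/dx for a linear score
        x_adv = x_adv + step * np.sign(grad)      # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the l_inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in valid pixel range
    return x_adv

x = np.array([0.5, 0.5])
w = np.array([1.0, -1.0])
x_adv = pgd_linf(x, w, eps=2 / 255, step=1 / 255, n_steps=5)
# The perturbation saturates at the ball boundary: x_adv - x is [+2/255, -2/255].
```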


Sidenote:

Target Selection

I noticed that you're calling find_adversarial_example (see example, docs) with target_selection = 2. This means that we attempt to perturb the input image such that the label of the output classification is 2. Was that what you intended? If not, you might want to use the keyword argument invert_target_selection.
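As a sketch of the two modes (resolve_targets is a made-up helper, not the package API):

```python
def resolve_targets(target_selection, invert=False, num_labels=10):
    """With invert=False, attack exactly the given labels.

    With invert=True, attack every label *except* those given, which is the
    usual way to ask "can the prediction be changed to anything else?".
    """
    selected = set(target_selection)
    labels = set(range(1, num_labels + 1))
    return sorted(labels - selected) if invert else sorted(selected)

print(resolve_targets([2]))               # [2]
print(resolve_targets([4], invert=True))  # [1, 2, 3, 5, 6, 7, 8, 9, 10]
```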

Tightening Algorithm

You can use the names of the enums directly: https://github.com/vtjeng/MIPVerify.jl/blob/b6a585dab860b64e15717d5c9219b705b62fea0c/src/MIPVerify.jl#L18

Rather than tightening_algorithm=MIPVerify.TighteningAlgorithm(2), you would write tightening_algorithm=mip.

vtjeng avatar Jun 01 '20 01:06 vtjeng

@vtjeng, to answer your questions:

  • The last log line the code hangs on was one of the progress bars, saying something like the output below. FYI, the MIPVerify processes were still running when this happened, but I didn't see any output. Should I be more patient? I expected LP to be fast (as in your paper, a 22-second mean solve time), so I killed the process after 15 minutes of no log output.
[notice | MIPVerify]: Attempting to find adversarial example. Neural net predicted label is 4, target labels are [2]
[notice | MIPVerify]: Rebuilding model from scratch. This may take some time as we determine upper and lower bounds for the input to each non-linear unit.
  Calculating upper bounds:   0%|                       |  ETA: 0:10:07
Academic license - for non-commercial use only
  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:03
  Calculating lower bounds: 100%|███████████████████████| Time: 0:00:00
  Imposing relu constraint: 100%|███████████████████████| Time: 0:00:00
  Calculating upper bounds: 100%|███████████████████████| Time: 0:00:08
  • In this experiment I didn't use a robustness objective, just a normally trained convolutional network, because I am working on another aspect of NNs. However, a somewhat similar situation occurs when I use the model trained as in the "provable" papers by Wong and Kolter (ICML 2018 and NeurIPS 2019). In the case of robust training, eps = 2/255 and the model is only fully connected. I have limited resources working from home, so training a provably robust CNN is not an option for me right now.
  • The sparse NN is obtained by naive magnitude pruning and fine-tuning. This network is definitely not robust, so as you pointed out, maybe this is a case where MIPVerify takes a long time to solve. I will try the sanity check you suggested. Thanks!

Target selection

It was my intention to use targets = [10] / {4} to verify all outputs; I want to know whether there exists an attack for each specific label. I also want to confirm that invert_target_selection means the code runs for targets {1,2,3,5,6,7,8,9,10} when the NN output is {4}, and that the model is "solvable" when there exists an attack which makes the output differ from {4}. Does this setting make MIPVerify faster? If so, how? Thank you for pointing out this parameter :).

Tightening Algorithm

Thanks for pointing that out! I work mostly with Python, so Julia's namespace feels a bit "unsafe" to me ^^.

Thanks for your quick reply!

gear avatar Jun 01 '20 02:06 gear

Closing this issue as we haven't had any new updates in a while. Feel free to update if you had any additional questions!

vtjeng avatar Jan 07 '24 21:01 vtjeng