ModelDiff
About
This is the artifact associated with our paper "ModelDiff: Testing-based DNN Similarity Comparison for Model Reuse Detection".
ModelDiff is a testing-based approach to deep learning model similarity comparison. Instead of directly comparing the weights, activations, or outputs of two models, ModelDiff compares their behavioral patterns on the same set of test inputs. Specifically, the behavioral pattern of a model is represented as a decision distance vector (DDV), in which each element is the distance between the model's reactions to a pair of inputs. The knowledge similarity between two models is measured with the cosine similarity between their DDVs.
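The idea can be illustrated with a minimal NumPy sketch. This is not the code from this repository: the function names are hypothetical, and cosine distance is assumed as the distance between a model's outputs on the two inputs of a pair.

```python
import numpy as np


def decision_distance_vector(outputs_a, outputs_b):
    """Return one distance per input pair.

    outputs_a[i] and outputs_b[i] are a model's outputs on the two inputs
    of pair i. Cosine distance is an assumption made for this sketch.
    """
    a = np.asarray(outputs_a, dtype=float)
    b = np.asarray(outputs_b, dtype=float)
    dot = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return 1.0 - dot / np.maximum(norms, 1e-12)


def ddv_similarity(ddv1, ddv2):
    """Cosine similarity between the DDVs of two models."""
    ddv1 = np.asarray(ddv1, dtype=float)
    ddv2 = np.asarray(ddv2, dtype=float)
    denom = np.linalg.norm(ddv1) * np.linalg.norm(ddv2)
    return float(np.dot(ddv1, ddv2) / max(denom, 1e-12))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Ten input pairs with 16 features each (toy data, not real images).
    first, second = rng.normal(size=(10, 16)), rng.normal(size=(10, 16))
    # Two toy "models": random linear maps with different output sizes.
    w1, w2 = rng.normal(size=(16, 5)), rng.normal(size=(16, 8))
    ddv1 = decision_distance_vector(first @ w1, second @ w1)
    ddv2 = decision_distance_vector(first @ w2, second @ w2)
    print("similarity:", ddv_similarity(ddv1, ddv2))
```

Because each DDV has one entry per input pair, two models can be compared even when their output dimensions differ, as with the two toy models above.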
Environment
- Ubuntu 16.04
- CUDA 10.0
Dependencies
- PyTorch 1.5.0
- TorchVision 0.6.0
- AdverTorch 0.2.0
Getting started
- You should have a GPU on your device, because computing the adversarial samples is slow without one
- First, install CUDA 10.2 on your device if it is not already installed
- Install Anaconda, then create a new environment and activate it
conda create --name modeldiff python=3.6
conda activate modeldiff
- Install PyTorch in the new environment
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.2 -c pytorch
- Install AdverTorch
pip install advertorch
- Install other packages
pip install scipy
- Make a new directory called data and download all three datasets listed below into the data directory
data/
|--- CUB_200_2011/
|--- stanford_dog/
|--- MIT_67/
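Once the installation steps above are done, the environment can be sanity-checked before any model is trained. The script below is a sketch and not part of this repository: the helper names are hypothetical, and the expected versions are taken from the Dependencies list (scipy is not pinned there, so any version is accepted).

```python
import importlib

# Import names and the versions listed under "Dependencies".
# None means this README does not pin a version.
EXPECTED = {
    "torch": "1.5.0",
    "torchvision": "0.6.0",
    "advertorch": "0.2.0",
    "scipy": None,
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # a broken install can raise more than ImportError
        return None
    return getattr(module, "__version__", "unknown")


def check_environment(expected=EXPECTED):
    """Return {module: (installed, expected, ok)} for each dependency."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Prefix match so that build suffixes such as "+cu102" still pass.
        ok = found is not None and (wanted is None or found.startswith(wanted))
        report[name] = (found, wanted, ok)
    return report


if __name__ == "__main__":
    for name, (found, wanted, ok) in check_environment().items():
        status = "ok" if ok else "CHECK"
        print(f"{name:12s} installed={found} expected={wanted or 'any'} [{status}]")
    # GPU check, only meaningful when torch imports successfully.
    try:
        import torch
        print("CUDA available:", torch.cuda.is_available())
    except Exception:
        print("CUDA available: unknown (torch could not be imported)")
```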
Prepare dataset
Caltech-UCSD 200 Birds
The layout should be as follows for the dataloader to load the data correctly
CUB_200_2011/
| README
| bounding_boxes.txt
| classes.txt
| image_class_labels.txt
| images.txt
| train_test_split.txt
|--- attributes
|--- images/
|--- parts/
|--- train/
|--- test/
Stanford 120 Dogs
stanford_dog/
| file_list.mat
| test_list.mat
| train_list.mat
|--- train/
|--- test/
|--- Images/
|--- Annotation/
MIT 67 Indoor Scenes
MIT_67/
| TrainImages.txt
| TestImages.txt
|--- Annotations/
|--- Images/
|--- test/
|--- train/
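The three layouts above can be checked with a short script before any training is started. This is a sketch under assumptions: the helper is hypothetical and not part of this repository, it only tests that the listed files and directories exist, and it does not inspect their contents.

```python
from pathlib import Path

# Entries copied from the three directory listings in this README.
EXPECTED_LAYOUT = {
    "CUB_200_2011": [
        "README", "bounding_boxes.txt", "classes.txt",
        "image_class_labels.txt", "images.txt", "train_test_split.txt",
        "attributes", "images", "parts", "train", "test",
    ],
    "stanford_dog": [
        "file_list.mat", "test_list.mat", "train_list.mat",
        "train", "test", "Images", "Annotation",
    ],
    "MIT_67": [
        "TrainImages.txt", "TestImages.txt",
        "Annotations", "Images", "test", "train",
    ],
}


def find_missing(data_root, layout=EXPECTED_LAYOUT):
    """Return the expected entries (relative to data_root) that do not exist."""
    root = Path(data_root)
    missing = []
    for dataset, entries in layout.items():
        for entry in entries:
            if not (root / dataset / entry).exists():
                missing.append(f"{dataset}/{entry}")
    return missing


if __name__ == "__main__":
    missing = find_missing("data")
    if missing:
        print("Missing entries:")
        for item in missing:
            print("  " + item)
    else:
        print("All expected dataset files and directories were found.")
```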
Prepare models
You can change the size of the benchmark and the number of models to use in benchmark.py. The models used in the paper are MobileNetV2 and ResNet18 trained on the Flower102 and StanfordDogs120 datasets. You can add other architectures and datasets in the ImageBenchmark class of benchmark.py (lines 487 to 503, shown below).
# Used in the paper
self.datasets = ['Flower102', 'SDog120']
self.archs = ['mbnetv2', 'resnet18']
# Other archs
# self.datasets = ['MIT67', 'Flower102', 'SDog120']
# self.archs = ['mbnetv2', 'resnet18', 'vgg16_bn', 'vgg11_bn', 'resnet34', 'resnet50']
# For debug
# self.datasets = ['Flower102']
# self.archs = ['resnet18']
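To get a feel for how these two lists affect the benchmark size, the sketch below counts the dataset/architecture pairs for each of the three settings. It assumes that models are built for every (dataset, arch) pair, which this README does not state explicitly, and it is not the code in benchmark.py. The setting names are labels chosen for this example.

```python
from itertools import product

# The three settings shown in the listing above.
SETTINGS = {
    "paper": (["Flower102", "SDog120"], ["mbnetv2", "resnet18"]),
    "other": (
        ["MIT67", "Flower102", "SDog120"],
        ["mbnetv2", "resnet18", "vgg16_bn", "vgg11_bn", "resnet34", "resnet50"],
    ),
    "debug": (["Flower102"], ["resnet18"]),
}


def dataset_arch_pairs(datasets, archs):
    """Return every (dataset, arch) pair. Assumed, not confirmed, to match the benchmark."""
    return list(product(datasets, archs))


if __name__ == "__main__":
    for name, (datasets, archs) in SETTINGS.items():
        pairs = dataset_arch_pairs(datasets, archs)
        print(f"{name}: {len(pairs)} dataset/arch pairs")
```

Under that assumption, the debug setting covers a single pair, which makes it a reasonable first run before enabling the larger settings.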
We also provide the benchmark used in the paper, which you can download from Google Drive.
Evaluation
The code that compares model similarity using DDVs (decision distance vectors) is in evaluate.ipynb. It loads the benchmark models from benchmark.py and compares their similarity.