PyTorch implementation of the paper "Dataset Distillation via Factorization" (NeurIPS 2022).

Dataset Factorization

This is the PyTorch implementation of the following NeurIPS 2022 paper:

Dataset Distillation via Factorization

Songhua Liu, Kai Wang, Xingyi Yang, Jingwen Ye, and Xinchao Wang.

Installation

  • Optionally, create a new conda environment:

    conda create -n HaBa python=3.8
    conda activate HaBa
    
  • Clone the repo and install the required packages:

    git clone https://github.com/Huage001/DatasetFactorization.git
    cd DatasetFactorization
    pip install -r requirements.txt
    

Dataset Distillation

  • Install the required packages, if you have not already done so during installation:

    pip install -r requirements.txt
    
  • First, generate a buffer of expert training trajectories:

    python buffer.py --dataset=CIFAR10 --model=ConvNet --train_epochs=50 --num_experts=100 --zca --buffer_path={path_to_buffer_storage} --data_path={path_to_dataset}
    
  • Then edit run_cifar10_ipc[xx]_style5.sh: replace {path_to_buffer_storage} with the path to your buffers and {path_to_dataset} with the path to your dataset.

  • Run:

    bash run_cifar10_ipc[xx]_style5.sh
    

    Here [xx], the number of images per class (IPC), can be 1, 10, or 50.

  • Most hyper-parameters follow the baseline repo. Feel free to try other argument configurations in the .sh files.

  • distill.py contains the original implementation of the baseline method MTT for comparison.
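
The steps above can also be scripted. Below is a minimal Python sketch, not part of this repo, that assembles the buffer.py command shown above and substitutes the two placeholders in the text of a run script. The helper names (build_buffer_command, fill_placeholders, script_name), the example paths, and the sample script line are hypothetical. It also assumes that [xx] is replaced literally in the script name (for example, ipc10).

```python
import shlex

ALLOWED_IPC = (1, 10, 50)  # the values this README lists for [xx]


def build_buffer_command(buffer_path, data_path):
    """Return the buffer.py invocation from this README as an argument list."""
    return [
        "python", "buffer.py",
        "--dataset=CIFAR10",
        "--model=ConvNet",
        "--train_epochs=50",
        "--num_experts=100",
        "--zca",
        f"--buffer_path={buffer_path}",
        f"--data_path={data_path}",
    ]


def fill_placeholders(script_text, buffer_path, data_path):
    """Replace the two README placeholders in the text of a run script.

    Plain string replacement: paths containing spaces may need quoting,
    depending on how the real script uses them.
    """
    filled = script_text.replace("{path_to_buffer_storage}", buffer_path)
    return filled.replace("{path_to_dataset}", data_path)


def script_name(ipc):
    """Build the run script name, assuming [xx] is substituted literally."""
    if ipc not in ALLOWED_IPC:
        raise ValueError(f"ipc must be one of {ALLOWED_IPC}, got {ipc}")
    return f"run_cifar10_ipc{ipc}_style5.sh"


if __name__ == "__main__":
    # Hypothetical paths; substitute your own.
    buffer_path, data_path = "./buffers", "./data"

    # Print the command as a single shell-safe string.
    print(shlex.join(build_buffer_command(buffer_path, data_path)))

    # Hypothetical script fragment; the real .sh contents are not shown here.
    sample = "--buffer_path={path_to_buffer_storage} --data_path={path_to_dataset}"
    print(fill_placeholders(sample, buffer_path, data_path))
    print(script_name(10))
```

To apply this to a real script, read the .sh file as text, pass it through fill_placeholders, and write the result back before running it with bash.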

Acknowledgement

This code borrows heavily from mtt-distillation and DatasetCondensation.

Citation

If you find this project useful in your research, please consider citing our paper and the default baseline method:

@article{liu2022dataset,
    author    = {Songhua Liu and Kai Wang and Xingyi Yang and Jingwen Ye and Xinchao Wang},
    title     = {Dataset Distillation via Factorization},
    journal   = {NeurIPS},
    year      = {2022},
}
@inproceedings{cazenavette2022distillation,
    author    = {George Cazenavette and Tongzhou Wang and Antonio Torralba and Alexei A. Efros and Jun-Yan Zhu},
    title     = {Dataset Distillation by Matching Training Trajectories},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year      = {2022},
}