CSD icon indicating copy to clipboard operation
CSD copied to clipboard

Implementation and datasets for Efficient Domain Generalization via Common-Specific Low-Rank Decomposition (https://arxiv.org/abs/2003.12815)

This folder contains code accompanying the submission: [[https://arxiv.org/abs/2003.12815][Efficient Domain Generalization via Common-Specific Low-Rank Decomposition]] (ICML 2020).

Our proposed technique: Common Specific Decomposition (CSD) is extremely simple and can be easily incorporated in to your codestack by replacing the final layer with CSD. You may find the this isolated code nuggets of CSD for [[https://gist.github.com/vihari/bad9868049ef62db783e0fc11b22bb5c][TensorFlow]] and [[https://gist.github.com/vihari/0dc2c296e74636725cfee364637fb4f7][PyTorch]] handy.

However, if you wish to reproduce numbers from our paper, we highly encourage that you use our codestack.

The folders are organized into

  1. rotation: Rotation tasks on MNIST and Fashion-MNIST
  2. hw: Hand-written tasks on LipitK and Nepali Character recognition datasets
  3. pacs: PACS evaluation with ResNet18 and AlexNet
  4. speech: Speech utterance classification evaluation

To be able to run all the experiments, you need the following packages for Python3.6

  1. Tensorflow <= 1.16 >1.13 for rotation, speech and hand-written experiments
  2. PyTorch >=1.0 for PACS
  3. Keras for loading datasets in rotation tasks.

The data for hand-written and rotation tasks are either provided or scripted to automatically download, however the PACS and speech dataset are to be downloaded and configured in order to run the provided code.

The following sections give more specific instructions on how to run the code.

  • Rotation tasks CSD: #+BEGIN_SRC python main.py --dataset mnist --classifier mos #+END_SRC ERM: #+BEGIN_SRC python main.py --dataset mnist --classifier simple #+END_SRC

  • Hand-written tasks

** LipitK Task #+BEGIN_SRC python hw_train_and_test.py --dataset lipitk --num_train #+END_SRC ** NHCD Task #+BEGIN_SRC python hw_train_and_test.py --dataset nhcd --lr 1e-4 #+END_SRC

Use flags: /--simple/ or /--cg/ for ERM or CG baselines.

  • PACS CSD is coded in to the implementation of [[https://github.com/fmcarlucci/JigenDG][JigenDG]], a previous domain generalizing approach.

Use the following steps in order to run experiments.

  1. Download and configure the [[https://domaingeneralization.github.io/][PACS dataset]] such that all the paths in /pacs/data/txt_lists/ are appropriate.
  2. Use /pacs/run.sh/ either to run PACS evaluation using AlexNet or ResNet-18.
  • Speech task You need to download Speech dataset and extract it in to /speech/speech_dataset// folder. It can be downloaded from: [[http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz][http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz ]]

You can then CSD using the follwing command. #+BEGIN_SRC $ cd speech; $ python train.py --training_percentage --train_dir= --model mos2 --seed 0 --learning_rate 2e-3 --how_many_epochs 500 --lmbda 0.5 --num_uids=<2(K)> #+END_SRC

Feel free to open an issue or write to me if you find something missing or confusing.