ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models (ICLR 2024)
- Authors: Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
- Paper: "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models"
- Project Page
We propose ECoFLaP, a two-stage coarse-to-fine weight pruning approach for Large Vision-Language Models (LVLMs). We first determine the sparsity ratios of different layers or blocks using a global importance score, computed efficiently from a zeroth-order approximation of the global model gradients. We then perform local layer-wise unstructured weight pruning on the multimodal model at the given ratios.
We validate our proposed method across various multimodal and unimodal models and datasets, demonstrating significant performance improvements over prevalent pruning techniques in the high-sparsity regime.
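The sketch below illustrates the two-stage idea in PyTorch-style pseudocode: a backprop-free (zeroth-order) estimate of each layer's global importance, a rule that converts those scores into per-layer sparsity ratios, and a Wanda-style local pruning step applied at those ratios. The names model, calib_batch, loss_fn, and act_norm are placeholders, and the allocation rule is a simplified heuristic, not the repository's exact implementation.

import torch

@torch.no_grad()
def zeroth_order_global_scores(model, calib_batch, loss_fn, eps=1e-3, n_samples=4):
    """Stage 1 (coarse): estimate per-tensor global importance without backprop.
    Two-sided finite difference with Gaussian perturbations:
    grad ~= (L(w + eps*z) - L(w - eps*z)) / (2 * eps) * z, importance = |w * grad|."""
    params = dict(model.named_parameters())
    scores = {name: 0.0 for name in params}
    for _ in range(n_samples):
        noise = {n: torch.randn_like(p) for n, p in params.items()}
        for n, p in params.items():
            p.add_(eps * noise[n])
        loss_plus = float(loss_fn(model, calib_batch))
        for n, p in params.items():
            p.sub_(2 * eps * noise[n])
        loss_minus = float(loss_fn(model, calib_batch))
        g = (loss_plus - loss_minus) / (2 * eps)
        for n, p in params.items():
            p.add_(eps * noise[n])  # restore the original weights
            scores[n] += (g * noise[n] * p).abs().sum().item()
    return scores

def allocate_sparsities(scores, target=0.5, floor=0.05, ceil=0.95):
    """Turn global scores into per-layer ratios: less important layers are
    pruned more, and the pre-clipping average stays at the target sparsity."""
    mean_score = max(sum(scores.values()) / len(scores), 1e-12)
    raw = {n: target * (2.0 - s / mean_score) for n, s in scores.items()}
    return {n: min(ceil, max(floor, r)) for n, r in raw.items()}

def prune_linear_wanda(weight, act_norm, sparsity):
    """Stage 2 (fine): local unstructured pruning of one linear layer with the
    Wanda metric |W| * ||x||, at the sparsity chosen for that layer."""
    metric = weight.abs() * act_norm  # act_norm: per-input-feature activation norm
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    threshold = metric.flatten().kthvalue(k).values
    return weight * (metric > threshold).float()

In the released code, stage 2 reuses an existing local method (Wanda or SparseGPT) at the per-layer ratios produced by stage 1, which is why the CLIP checkpoint table below lists both ECoFLaP w/ Wanda and ECoFLaP w/ SparseGPT variants.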

Changelog
- [Feb 2024] Checkpoints added
Checkpoints
BLIP2
All checkpoints below are pruned to 0.5 sparsity.
| Wanda | ECoFLaP first-order | ECoFLaP zeroth-order |
|---|---|---|
| Ckpt | Ckpt | Ckpt |
FlanT5 XL
| Wanda | ECoFLaP first-order | ECoFLaP zeroth-order |
|---|---|---|
| Ckpt | Ckpt | Ckpt |
ViT-B/16
| Wanda | ECoFLaP first-order | ECoFLaP zeroth-order |
|---|---|---|
| Ckpt | Ckpt | Ckpt |
CLIP
All checkpoints below are pruned to 0.4 sparsity.
| Wanda | SparseGPT | ECoFLaP w/ Wanda | ECoFLaP w/ SparseGPT |
|---|---|---|---|
| Ckpt | Ckpt | Ckpt | Ckpt |
BLIP
Sparsities are all 0.5
| Dataset | Wanda | ECoFLaP | ECoFLaP w/ fine-tuning |
|---|---|---|---|
| VQA | Ckpt | Ckpt | Ckpt |
| NLVR2 | Ckpt | Ckpt | Ckpt |
| Flickr | Ckpt | Ckpt | Ckpt |
| COCO Caption | Ckpt | Ckpt | Ckpt |
- Additional results for the BLIP models:
| Methods | VQA (test-dev) | Flickr30k (TR@1/IR@1) | NLVR2 (val/test) | COCO Cap. (CIDEr/SPICE) |
|---|---|---|---|---|
| Full model | 77.4 | 96.8/86.9 | 82.3/83.6 | 133.3/23.8 |
| Wanda (w/o fine-tuning) | 71.9 | 85.3/72.3 | 78.3/78.1 | 97.1/18.4 |
| ECoFLaP (w/o fine-tuning) | 73.6 | 90.2/79.5 | 79.1/79.2 | 111.0/20.3 |
| UPop (w/ fine-tuning) | 76.3 | 94.0/82.0 | 80.3/81.1 | 128.9/23.3 |
| ECoFLaP (w/ fine-tuning) | 76.7 | 96.8/85.6 | 81.8/82.5 | 132.3/23.8 |
BLIP-2, FlanT5, ViT experiment scripts
The main code for this part is in LAVIS/. Run all commands below from LAVIS/ (cd LAVIS/).
Installation
pip install -e .
Dataset
Follow the scripts in lavis/datasets/download_scripts/ to download the datasets.
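For example, the COCO data can be fetched with one of the provided scripts; the exact script name depends on the dataset you need (download_coco.py is one of the scripts shipped with LAVIS):
python lavis/datasets/download_scripts/download_coco.py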
BLIP-2 Scripts
## BLIP-2 experiments
# ECoFLaP - zeroth order
python scripts/blip2/ecoflap_zeroth.py 0 12341
# ECoFLaP - first order
python scripts/blip2/ecoflap_first.py 0 12341
# Wanda
python scripts/blip2/wanda.py 0 12341
# SparseGPT
python scripts/blip2/sparsegpt.py 0 12341
ViT Scripts
# ECoFLaP - zeroth order
python scripts/eva_clip/ecoflap.py 0 12341
# Wanda
python scripts/eva_clip/wanda.py 0 12341
FlanT5 Scripts
### Generate the pruned checkpoint
# ECoFLaP - zeroth order
python scripts/t5/ecoflap.py 0 12341
### Do the five-shot evaluation
# go to the mmlu_eval folder
cd ../mmlu_eval
# Set pruned_checkpoint in test.sh to the checkpoint generated in the previous step (see the example after this block)
bash test.sh
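For illustration only, the assignment inside mmlu_eval/test.sh would look roughly like the line below; the path is a placeholder for the checkpoint produced by the pruning step, and the exact variable location depends on the script.
pruned_checkpoint=/path/to/pruned_flant5_checkpoint.pth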
CLIP experiments
The main code for this part is in CoOp/. Run all commands below from CoOp/ (cd CoOp/).
Installation
pip install -r requirements.txt
Dataset
Follow the instructions in DATASETS.md to download the datasets.
Scripts
# Wanda and ECoFLaP (w/ Wanda)
bash scripts/coop/ecoflap_wanda.sh
# SparseGPT and ECoFLaP (w/ SparseGPT)
bash scripts/coop/ecoflap_sparsegpt.sh
BLIP experiments to compare with UPop
The main code for this part is in UPop/. Run all commands below from UPop/ (cd UPop/).
Installation
pip install -r requirements.txt
Dataset
Follow the instructions in UPop's README.md to download the datasets.
Scripts
### Set task to one of: coco, flickr, nlvr2, vqa
# Wanda
bash ecoflap_scripts/${task}/wanda.sh
# ECoFLaP
bash ecoflap_scripts/${task}/ecoflap.sh
# Fine-tune the pruned checkpoint obtained by ECoFLaP
bash ecoflap_scripts/${task}/ecoflap_finetuning.sh
LLaMA experiments
The main code for this part is in LLaMA/. Run all commands below from LLaMA/ (cd LLaMA/).
Installation
Follow the instructions in Install.md to set up the environment.
Scripts
We removed --cache_dir, so the program reads the model cache from $HF_HOME (if set) or from the default Hugging Face cache directory.
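For example, to point the run at a specific cache directory (the path below is a placeholder):
export HF_HOME=/path/to/hf_cache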
# ECoFLaP
bash scripts/ecoflap_zero.sh 0
Bibtex
@inproceedings{Sung2024ECoFLaP,
author = {Yi-Lin Sung and Jaehong Yoon and Mohit Bansal},
title = {ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models},
booktitle = {International Conference on Learning Representations (ICLR)},
year = {2024},
}