MIT HAN Lab

41 repositories owned by MIT HAN Lab

TinyChatEngine

Stars: 569, Forks: 55

TinyChatEngine: On-Device LLM Inference Library

efficientvit

Stars: 1.8k, Forks: 164

EfficientViT is a new family of vision models for efficient high-resolution vision.

spatten

Stars: 66, Forks: 7

[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

offsite-tuning

Stars: 361, Forks: 36

Offsite-Tuning: Transfer Learning without Full Model

distrifuser

Stars: 557, Forks: 21

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

flatformer

Stars: 96, Forks: 9

[CVPR'23] FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer

llm-awq

Stars: 1.6k, Forks: 115

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

sparsevit

Stars: 46, Forks: 2

[CVPR'23] SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

streaming-llm

Stars: 6.0k, Forks: 346

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks