xlite-dev
Results: 4 repositories owned by xlite-dev
Awesome-LLM-Inference
Stars: 4.9k | Forks: 330 | Watchers: 4.9k
📚 A curated list of Awesome LLM/VLM Inference Papers with Code: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉
Awesome-DiT-Inference
Stars: 478 | Forks: 24 | Watchers: 478
📚 A curated list of Awesome Diffusion Inference Papers with Code: Sampling, Cache, Quantization, Parallelism, etc. 🎉
flux-faster
Stars: 24 | Forks: 0 | Watchers: 24
A fork of flux-fast that uses cache-dit to make it even faster: 3.3x speedup on NVIDIA L20.
ffpa-attn
Stars: 242 | Forks: 12 | Watchers: 242
🤖 FFPA: Extends FlashAttention-2 with Split-D, giving ~O(1) SRAM complexity for large headdim and a 1.8x to 3x speedup over SDPA EA. 🎉