🔮 Project: Large Language Model Efficiency Challenge
WHY The costs of accessing, fine-tuning, and querying foundation models for new tasks are substantial. Because of these costs, access to performant LLMs has been gated behind the expensive, often proprietary hardware used to train them, putting them out of reach for those without significant resources. This project explores the latest innovations in adapting LLMs to specific tasks under GPU resource constraints while maintaining performance quality.
HOW The challenge is set with specific constraints and an ambitious goal:
- Constraint: Adapt a foundation model to specific tasks by fine-tuning on a single GPU (an A100) within 24 hours.
- Goal: Maintain high accuracy for the desired tasks.
Techniques to be explored and analyzed include:
- Low-Rank Adaptation (LoRA): Designing adapters as the product of two low-rank matrices, building on insights showing that pre-trained language models can learn efficiently in a smaller subspace.
- QLoRA: Extending LoRA by fine-tuning adapters on top of a 4-bit quantized base model. Innovations include the 4-bit NormalFloat data type, double quantization, and paged optimizers.
- Lightning/FlashAttention/DeepSpeed/FairScale: Utilizing external tools/plugins to enhance data usage, training efficiency, and model quality.
- Advanced topic - Blackbox LoRA: Current optimization methods rely on backpropagating through the whole model. Blackbox optimization instead optimizes the small set of adapter weights without backprop. Contact Valentin for more details about the theory.
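To make the LoRA construction concrete, here is a minimal PyTorch sketch of a frozen linear layer augmented with a trainable product of two low-rank matrices. This is an illustration of the idea, not the reference implementation; the class name `LoRALinear` and the initialization constants are our own choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update:
    y = base(x) + (alpha / r) * x @ A^T @ B^T, where A is (r x in) and
    B is (out x r). Only A and B receive gradients."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pre-trained weights stay frozen
        self.r, self.alpha = r, alpha
        # B starts at zero so the adapter's initial contribution is zero:
        # training begins exactly at the pre-trained model.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (self.alpha / self.r) * (x @ self.A.T @ self.B.T)

# A 64x64 layer with rank-4 adapters: 512 trainable vs. 4160 frozen parameters.
layer = LoRALinear(nn.Linear(64, 64), r=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 512
```

Because only `A` and `B` are trained, optimizer state and gradient memory scale with the adapter rank rather than the full weight matrices, which is what makes single-GPU fine-tuning feasible.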
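The blackbox direction can be illustrated with SPSA (simultaneous perturbation stochastic approximation), a standard gradient-free method: the gradient over the small adapter weight vector is estimated from two forward evaluations with a random perturbation, so no backpropagation through the model is needed. This toy sketch uses a quadratic objective as a stand-in for an actual model loss.

```python
import torch

def spsa_step(params: torch.Tensor, loss_fn, lr: float = 0.05, c: float = 0.05):
    """One SPSA update: estimate the gradient of loss_fn from two forward
    evaluations with a random +/-1 perturbation -- no backprop required."""
    delta = torch.randint(0, 2, params.shape).float() * 2 - 1  # Rademacher +/-1
    loss_plus = loss_fn(params + c * delta)
    loss_minus = loss_fn(params - c * delta)
    grad_est = (loss_plus - loss_minus) / (2 * c) * delta
    return params - lr * grad_est

# Toy objective standing in for a model loss over a small adapter vector.
target = torch.tensor([1.0, -2.0, 0.5])
loss = lambda p: ((p - target) ** 2).sum()

w = torch.zeros(3)
for _ in range(2000):
    w = spsa_step(w, loss)
print(loss(w).item())  # much smaller than the initial loss of 5.25
```

Each step costs two forward passes regardless of model size, which is why this only pays off when the optimized parameter set (e.g. LoRA adapters) is small.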
WHAT The results of this project will lead to:
- Insights and Lessons: A distilled set of well-documented steps and easy-to-follow tutorials that encapsulate the learnings from the challenge.
- Innovation in Efficiency: Uncovering new techniques and methods that can significantly impact the way VOD-trained models are adapted and fine-tuned.
References
- Low-Rank Adaptation (LoRA)
- QLoRA
- FlashAttention
- Lightning Fabric
- DeepSpeed
- FairScale
- NeurIPS LLM Efficiency Challenge