# [Roadmap] LMFlow Roadmap
This document lists the features on LMFlow's roadmap. We welcome discussion of, and contributions to, specific features in the related Issues/PRs. 🤗
## Main Features
- Data
  - [x] DPO dataset format #867
  - [ ] Conversation template in DPO #883
  - [ ] Jinja template (see the conversation-template sketch after this list)
  - [ ] Tools in conversation dataset #884 #892
- Model
  - Backend
    - [ ] 🏗️ Accelerate support
  - Tokenization
    - [ ] Tokenization update, using the Hugging Face method
- Pipeline
  - Train/Finetune/Align
    - [x] DPO (multi-gpu) #867
    - [x] Iterative DPO #867 #883
    - [ ] PPO
    - [ ] LISA (multi-gpu, qwen2, chatglm) #899
    - [ ] Batch size and learning rate recommendation (arxiv)
    - [ ] No-trainer versions of the pipelines, allowing users to customize/modify them as needed
    - [ ] Sparse training for MoE models #879
  - Inference
    - [x] vLLM inference #860 #863 (see the sketch after this list)
    - [x] Reward model scoring #867
    - [x] Multiple-instance inference (vllm, rm, others) #883
    - [ ] Inference checkpointing and resuming from checkpoints
    - [ ] Inference acceleration with EAGLE
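For the Jinja template item under Data above, a minimal sketch of what Jinja-based conversation templating could look like, using Hugging Face transformers' `apply_chat_template` (which renders the Jinja chat template bundled with the tokenizer). The model name is illustrative, not a statement about LMFlow's defaults:

```python
from transformers import AutoTokenizer

# Illustrative model choice; any chat model with a bundled template works.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is LMFlow?"},
]

# Render the conversation into a single prompt string using the
# tokenizer's built-in Jinja template, appending the assistant prefix.
prompt = tokenizer.apply_chat_template(
    conversation, tokenize=False, add_generation_prompt=True
)
print(prompt)
```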
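The vLLM inference item is already merged (#860, #863); for reference, a minimal offline-inference sketch against vLLM's public API looks like the following. The model name is illustrative, and LMFlow's actual wrapper around vLLM may differ:

```python
from vllm import LLM, SamplingParams

# Load the model with vLLM's offline inference engine (illustrative model).
llm = LLM(model="Qwen/Qwen2-0.5B-Instruct")

sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

# Batch generation; vLLM schedules prompts with continuous batching.
outputs = llm.generate(["What is LMFlow?"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```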
## Usability
- [ ] ❗ 🏗️ Make some packages/functions (gradio, vllm, ray, etc.) optional, add conditional imports. #905 (see the sketch after this list)
- [ ] Inference method auto-downgrading (vllm > ds, etc.), and make the `vllm` package optional
- [ ] Merging similar model methods into `hf_model_mixin`
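A minimal sketch of the conditional-import and backend-downgrading pattern the first two items describe; the helper and backend names here are illustrative assumptions, not LMFlow's actual API (#905 tracks the real change):

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if `name` can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


# Prefer vLLM when installed; otherwise downgrade to a slower but
# always-available backend (backend labels are illustrative).
if is_package_available("vllm"):
    backend = "vllm"
elif is_package_available("deepspeed"):
    backend = "deepspeed"
else:
    backend = "hf"

print(f"Selected inference backend: {backend}")
```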
## Bug fixes
- [ ] `model.generate()` with dsz3 #861
- [ ] `merge_lora` with absolute-path LoRA merging
- [ ] `load_dataset` long data fix #878
- [ ] 🏗️ `src/lmflow/utils/common.py` `create_copied_dataclass` compatibility when Python version >= 3.10 (`kw_only` issue) #903 #905 (see the sketch after this list)
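For context on the `kw_only` item: `dataclasses.Field` only gained the `kw_only` attribute in Python 3.10, so a helper that copies fields into a new dataclass has to pass it through conditionally. The sketch below is an illustrative reconstruction of that pattern, not LMFlow's actual `create_copied_dataclass`:

```python
import sys
from dataclasses import MISSING, dataclass, field, fields, make_dataclass


def copy_dataclass(cls, name_prefix: str = "Copied"):
    """Create a new dataclass whose fields mirror those of `cls`."""
    new_fields = []
    for f in fields(cls):
        kwargs = {"metadata": f.metadata}
        if f.default is not MISSING:
            kwargs["default"] = f.default
        elif f.default_factory is not MISSING:
            kwargs["default_factory"] = f.default_factory
        if sys.version_info >= (3, 10):
            # `Field.kw_only` (and the `kw_only` argument of `field()`)
            # only exist on Python >= 3.10.
            kwargs["kw_only"] = f.kw_only
        new_fields.append((f.name, f.type, field(**kwargs)))
    return make_dataclass(name_prefix + cls.__name__, new_fields)


@dataclass
class ModelArgs:
    model_name: str = "gpt2"


CopiedModelArgs = copy_dataclass(ModelArgs)
print(CopiedModelArgs(model_name="gpt2-medium"))
```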
## Issues left over from history
- [ ] `use_accelerator` -> `use_accelerate` typo fix (with the Accelerate support PR)
- [ ] `model_args.use_lora` leads to truncation of the sequence, mentioned in #867
- [ ] Make ports, addresses, and all other settings in distributed training tidy and clear (with the Accelerate support PR)
## Documentation
- [ ] Approximate GPU memory requirements w.r.t. model size & pipeline (see the rule-of-thumb sketch below)
- [ ] Dev handbook covering code style, the test list, etc.
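For the GPU-memory item, a common back-of-the-envelope estimate. The per-parameter byte counts are standard approximations (assumptions, not measured LMFlow figures): 2 bytes/param for bf16 weights at inference, and roughly 16 bytes/param for mixed-precision full fine-tuning with Adam (bf16 weights/grads plus fp32 master weights and optimizer states), before activations:

```python
# Rough GPU-memory rule of thumb; constants are common approximations,
# used here for illustration only.
def estimate_gpu_memory_gib(num_params_billion: float, pipeline: str) -> float:
    bytes_per_param = {
        "inference_bf16": 2,   # bf16 weights only
        "finetune_adam": 16,   # bf16 weights/grads + fp32 master weights + Adam states
    }[pipeline]
    return num_params_billion * 1e9 * bytes_per_param / 1024**3


# e.g. a 7B model: ~13 GiB to serve in bf16, ~104 GiB to fully fine-tune.
print(estimate_gpu_memory_gib(7, "inference_bf16"))  # ~13.0
print(estimate_gpu_memory_gib(7, "finetune_adam"))   # ~104.3
```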