Fast-LLM
Fast-LLM copied to clipboard
Generalize dynamic config classes
✨ Description
Fix: #126
Generalize the concept of dynamic config class from the dataset mechanism to all config class. I opted for a unique global registry of all config classes. This choice is motivated by simplicity (multiple registries would be hell to manage), though it means registry keys need to be globally unique so they'll tend to be longer. Config classes are now specified through the type field, using the config class name or its shorthand without the Config suffix.
Included a proposal to simplify the cli, so the runnable is selected through its type field, ex. fast-llm type=GPTTrainer ... instead of fast-llm train gpt.
Added backward compatibility for the old dataset dynamic type names.
TODO:
- Add backward compatibility for the cli?
- The
typefield conflicts with normalization, rotary and peft types. These are good use cases for dynamic classes so I'll implement, but I'll wait after #237. - Some other potential use cases in the optimizer (Lr schedule), but no need to do in this PR.
🔍 Type of change
Select all that apply:
- [x] 🐛 Bug fix (non-breaking change that addresses a specific issue)
- [x] 🚀 New feature (non-breaking change that adds functionality)
- [x] ⚠️ Breaking change (a change that could affect existing functionality)
- [ ] 📈 Performance improvement/optimization (improves speed, memory usage, or efficiency)
- [x] 🛠️ Code refactor (non-functional changes that improve code readability, structure, etc.)
- [ ] 📦 Dependency bump (updates dependencies, including Dockerfile or package changes)
- [ ] 📝 Documentation change (updates documentation, including new content or typo fixes)
- [ ] 🔧 Infrastructure/Build change (affects build process, CI/CD, or dependencies)
looks great so far!
fast-llm type=GPTTrainer is principled (because it taps into the override logic) but ugly (because spelling out type= is mandatory and because it's using class names as values). I think the better approach would be to still use the subcommand trick and force that commands set type to a str.
about the global registry thing. another approach would be to outsource the functionality to a mixin class that can be optionally added to a config class. that way we can have more control over where this feature is used, and it would also for having different registries for different config base classes.
fast-llm type=GPTTraineris principled (because it taps into the override logic) but ugly (because spelling outtype=is mandatory and because it's using class names as values). I think the better approach would be to still use the subcommand trick and force that commands settypeto astr.
It would be a bit of a hack, but I can make it work.
about the global registry thing. another approach would be to outsource the functionality to a mixin class that can be optionally added to a config class. that way we can have more control over where this feature is used, and it would also for having different registries for different config base classes.
I considered several alternatives, but they all tend to run into trouble once we start dealing with more than just a few registries. The user has to think about what registry a class belongs to when setting the name, and make sure to pick a unique name for that one, so the benefit of having shorter names would come at a cost to developers. Also there is the issue of classes potentially belonging to multiple registries, which can be solved at the price of extra complexity and varying type field name, which makes things even worse.
I tried to think of a developer-friendly solution for a while but ended up failing to, so went for a single global registry instead. It's good enough because the validation mechanism ensures the selected class has the expected base class. The only downsides are the less user-friendly names and the impossibility to restrict which subclasses are allowed for a given field.
@tscholak I started working on more dynamic classes and realized user-friendly names are essential. So I implemented a simple solution where each class can have its own registry, and the class is selected by iterating through the registries in the MRO. We still build a Config registry from class names, and others can be created with a simple register call.
So we get neat names without adding complexity on the developer side. The only real missing piece would some way for users to find out the available types, we should add it to automatic config documentation once we have it.
yes, this is great!