Fast-LLM icon indicating copy to clipboard operation
Fast-LLM copied to clipboard

Generalize dynamic config classes

Open jlamypoirier opened this issue 7 months ago • 4 comments

✨ Description

Fix: #126

Generalize the concept of dynamic config class from the dataset mechanism to all config class. I opted for a unique global registry of all config classes. This choice is motivated by simplicity (multiple registries would be hell to manage), though it means registry keys need to be globally unique so they'll tend to be longer. Config classes are now specified through the type field, using the config class name or its shorthand without the Config suffix.

Included a proposal to simplify the cli, so the runnable is selected through its type field, ex. fast-llm type=GPTTrainer ... instead of fast-llm train gpt.

Added backward compatibility for the old dataset dynamic type names.

TODO:

  • Add backward compatibility for the cli?
  • The type field conflicts with normalization, rotary and peft types. These are good use cases for dynamic classes so I'll implement, but I'll wait after #237.
  • Some other potential use cases in the optimizer (Lr schedule), but no need to do in this PR.

🔍 Type of change

Select all that apply:

  • [x] 🐛 Bug fix (non-breaking change that addresses a specific issue)
  • [x] 🚀 New feature (non-breaking change that adds functionality)
  • [x] ⚠️ Breaking change (a change that could affect existing functionality)
  • [ ] 📈 Performance improvement/optimization (improves speed, memory usage, or efficiency)
  • [x] 🛠️ Code refactor (non-functional changes that improve code readability, structure, etc.)
  • [ ] 📦 Dependency bump (updates dependencies, including Dockerfile or package changes)
  • [ ] 📝 Documentation change (updates documentation, including new content or typo fixes)
  • [ ] 🔧 Infrastructure/Build change (affects build process, CI/CD, or dependencies)

jlamypoirier avatar Apr 30 '25 18:04 jlamypoirier

looks great so far!

fast-llm type=GPTTrainer is principled (because it taps into the override logic) but ugly (because spelling out type= is mandatory and because it's using class names as values). I think the better approach would be to still use the subcommand trick and force that commands set type to a str.

about the global registry thing. another approach would be to outsource the functionality to a mixin class that can be optionally added to a config class. that way we can have more control over where this feature is used, and it would also for having different registries for different config base classes.

tscholak avatar May 02 '25 17:05 tscholak

fast-llm type=GPTTrainer is principled (because it taps into the override logic) but ugly (because spelling out type= is mandatory and because it's using class names as values). I think the better approach would be to still use the subcommand trick and force that commands set type to a str.

It would be a bit of a hack, but I can make it work.

about the global registry thing. another approach would be to outsource the functionality to a mixin class that can be optionally added to a config class. that way we can have more control over where this feature is used, and it would also for having different registries for different config base classes.

I considered several alternatives, but they all tend to run into trouble once we start dealing with more than just a few registries. The user has to think about what registry a class belongs to when setting the name, and make sure to pick a unique name for that one, so the benefit of having shorter names would come at a cost to developers. Also there is the issue of classes potentially belonging to multiple registries, which can be solved at the price of extra complexity and varying type field name, which makes things even worse.

I tried to think of a developer-friendly solution for a while but ended up failing to, so went for a single global registry instead. It's good enough because the validation mechanism ensures the selected class has the expected base class. The only downsides are the less user-friendly names and the impossibility to restrict which subclasses are allowed for a given field.

jlamypoirier avatar May 02 '25 22:05 jlamypoirier

@tscholak I started working on more dynamic classes and realized user-friendly names are essential. So I implemented a simple solution where each class can have its own registry, and the class is selected by iterating through the registries in the MRO. We still build a Config registry from class names, and others can be created with a simple register call.

So we get neat names without adding complexity on the developer side. The only real missing piece would some way for users to find out the available types, we should add it to automatic config documentation once we have it.

jlamypoirier avatar May 05 '25 23:05 jlamypoirier

yes, this is great!

tscholak avatar May 06 '25 17:05 tscholak