Add learning rate schedules
Add a few learning rate schedules to mlx.optimizers.
Don't we need to add parameter groups for this as well?
I think this is separate from parameter groups. We may want parameter groups but I think that is a lower priority at the moment.
When implementing learning rate schedules, it is common to assign distinct learning rates to different parameter groups. For instance, in a model with separate components such as a Head and a Body, each component's parameters may be updated with its own learning rate.
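To make the request concrete, here is a minimal plain-Python sketch of per-group learning rates. It does not use any existing `mlx.optimizers` API; `sgd_step`, `group_lrs`, and the `"body"`/`"head"` group names are hypothetical and only illustrate the idea.

```python
# Hypothetical sketch: an SGD update where each parameter group
# ("body", "head") has its own learning rate. Not an mlx API.

def sgd_step(params, grads, group_lrs):
    """Return updated parameters, scaling each group's gradient by its own rate."""
    return {
        group: [p - group_lrs[group] * g
                for p, g in zip(params[group], grads[group])]
        for group in params
    }


params = {"body": [1.0, 2.0], "head": [3.0]}
grads = {"body": [0.5, 0.5], "head": [1.0]}

# Assumed setup: a small rate for the body, a larger one for the head.
group_lrs = {"body": 0.01, "head": 0.1}

updated = sgd_step(params, grads, group_lrs)
```

The body parameters move by `0.01 * grad` while the head parameter moves by `0.1 * grad`, which is the behavior parameter groups would need to support.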
Yeah, I'm not disagreeing that one can also have separate schedules for different parameters. But we can add schedules as a separate feature, without support for parameter groups. If we add parameter groups later, we can address how to get them to interoperate with schedules at that time.
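As a rough illustration of why the two features can be decoupled, here is a plain-Python sketch in which a schedule is just a callable from step count to learning rate, and the optimizer accepts either a float or such a callable. The names `cosine_decay` and `ScheduledSGD` and their signatures are assumptions made for this sketch, not the API proposed in this PR.

```python
import math


def cosine_decay(init_lr, decay_steps, end_lr=0.0):
    """Hypothetical schedule factory: returns a function mapping step -> learning rate."""
    def schedule(step):
        # Clamp so the rate stays at end_lr once decay_steps is reached.
        t = min(step, decay_steps) / decay_steps
        return end_lr + 0.5 * (init_lr - end_lr) * (1.0 + math.cos(math.pi * t))
    return schedule


class ScheduledSGD:
    """Hypothetical optimizer: one learning rate for all parameters, no groups."""

    def __init__(self, learning_rate):
        self._lr = learning_rate  # either a float or a callable of the step
        self.step = 0

    @property
    def learning_rate(self):
        return self._lr(self.step) if callable(self._lr) else self._lr

    def update(self, params, grads):
        lr = self.learning_rate
        self.step += 1
        return [p - lr * g for p, g in zip(params, grads)]


schedule = cosine_decay(0.1, decay_steps=100)
opt = ScheduledSGD(learning_rate=schedule)
new_params = opt.update([1.0], [1.0])  # the first update uses the initial rate
```

Because the optimizer only ever asks "what is the rate at this step?", a later parameter-group feature could in principle map each group to its own float or callable without changing how schedules themselves are defined.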