ReinforcementLearning.jl
Spin off core packages
@HenriDeh What do you think about spinning off RLCore, RLEnvs, and RLBase into separate repos? I know we keep on having different discussions around these topics and it's hard to pull the trigger / settle on a 'perfect' solution. We could start by spinning off RLBase, see whether it's a net gain / loss and go from there.
(I had been skeptical in the past about keeping things in sync as they get split off, but things have gone quite well with RLTrajectories, so I'm feeling optimistic)
This means that we would have to carefully handle versioning and compat entries, as we have done with RLTrajectories, which I think is a good thing.
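To make the compat point concrete, here is a rough sketch of what a downstream package (say, RLZoo) might declare in its Project.toml once the packages live in separate repos. The version numbers are placeholders chosen for illustration, not the actual releases, and which packages end up listed is an assumption.

```toml
# Fragment of a downstream package's Project.toml (illustrative only).
# Version numbers below are hypothetical placeholders.
[compat]
# For 0.x versions, "0.12" allows any 0.12.z release but not 0.13.0,
# so a breaking RLBase release has to be opted into explicitly.
ReinforcementLearningBase = "0.12"
ReinforcementLearningCore = "0.10"
julia = "1.9"
```

With bounds like these, a breaking release of a split-off package cannot silently reach the packages that depend on it. Each downstream repo has to bump its compat entry, which is the careful versioning work mentioned above.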
RLBase is an interface package (used by Core/Zoo and Environments). RLCore is a package of common components for building algorithms. RLZoo is a collection of algorithm implementations that use RLCore's components. RLExperiments is a collection of training examples.
The main argument against is that contributions that span multiple subrepos will have to be split into separate PRs, one for each repo. In that case, the CI of, say, RLZoo will have to wait for the related PR in, say, RLCore to be merged and released before it can use the RLCore components that the algorithm contribution in Zoo needs. If that does not work and further changes to RLCore are needed, a new RLCore PR must be opened, and so on. It could drastically slow down the contribution process.
I think this would not be an issue if RLCore were mature and thus required only light maintenance, like RLBase. But RLCore is not mature. It does not currently contain a comprehensive set of tools to implement (or, in our case, fix) the algorithms that we need for Zoo to be back online.
The main argument in favor is that you only need to clone each package separately and dev each of them, and voilà: the whole set of dev_mode problems goes away.
So my take on this is: RLBase, we can try, because I think we can call it mature (it barely ever moves); RLCore, we may need a roadmap before this is possible, and it possibly involves refactoring all the commented-out algorithms first...
Trajectories is kind of a weird case. To me, its content had its place in RLCore (and it was there before), and in fact it is simply imported and reexported by Core. It was split out because it was convenient to work this way, as it is quite big in itself, and I think because it is abstract enough to potentially be used by other packages. The split is successful because of thorough testing, which is one thing I think is lacking in RLCore: most things are tested by running an experiment in RLExps.
I will briefly add that my experience with having a bunch of small "tools" packages was quite negative. In POMDPs.jl, we used to have a bunch of different tools packages, but it was a HUGE pain to maintain and document, so we combined them all into POMDPTools.jl.
The best practice that I have converged on is to have a separate, small, pristine interface package (this could be a Core package), but to keep the tools in one big package.
I think the most important thing to think about is people (different types of users, developers, etc.) rather than code. Ask yourself "What will be the user/developer's experience?" or "If someone has an idea (that may or may not be a good idea), what is their path to contributing it?"