clearml-agent
Consider clarifying: is this an alternative to Kubeflow?
As a dummy who is evaluating different options for ML Ops, I don't have a full picture of how Kubeflow works. Does trains-agent integrate with Kubeflow? Or is it a more R&D-friendly replacement?
Hi @austinkeller
> Or is it a more R&D-friendly replacement?
Kind of, but it also integrates with Kubeflow :)
Specifically, Kubeflow assumes that every step is a self-contained container and that data can be volume-mounted, etc. In this respect, trains-agent solves the containerization problem and adds logging to the process.
To understand how trains works, the usual dev steps are:
- Write code on a "local" machine. With trains, all the code/environment/arguments are logged (along with a few other things that are less relevant to our case)
- Clone experiment in UI (or from code / automation)
- Put the experiment into the execution queue (the trains scheduler; it also handles priorities etc., and its UI is part of the system UI, see trains-server)
- trains-agent, running on a remote machine in daemon mode, pulls the experiment from the execution queue, sets up the environment accordingly, and launches / monitors the process
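To make the flow above concrete, here is a toy, in-memory model of the four steps. It does not use the trains API at all; `Experiment`, `ExecutionQueue`, and `agent_run_once` are hypothetical names invented for illustration, and the real system stores all of this on trains-server rather than in a Python heap.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Step 1: what gets recorded while the code runs locally (hypothetical fields)."""
    name: str
    code: str                       # e.g. repository URL plus commit
    environment: list               # e.g. pinned package requirements
    arguments: dict = field(default_factory=dict)

    def clone(self, **overrides):
        """Step 2: copy the recorded experiment, optionally changing arguments."""
        return Experiment(self.name + " (clone)", self.code,
                          list(self.environment),
                          {**self.arguments, **overrides})


class ExecutionQueue:
    """Step 3: a priority queue; a lower number means a higher priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker, keeps FIFO order

    def enqueue(self, experiment, priority=0):
        heapq.heappush(self._heap, (priority, next(self._counter), experiment))

    def pull(self):
        """Return the highest-priority experiment, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None


def agent_run_once(queue):
    """Step 4: pull one experiment, 'set up' its environment, and 'launch' it."""
    exp = queue.pull()
    if exp is None:
        return None
    return f"ran {exp.name} with {exp.arguments} after installing {exp.environment}"
```

A remote worker in this model would simply call `agent_run_once(queue)` in a loop; the point is only that the experiment record, not a hand-built container, is what travels through the queue.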
Back to Kubeflow: since the experiment is created automatically (see the first step: trains records the environment and creates the experiment at runtime), trains-agent can build a Docker container for the experiment, to be used later by Kubeflow. This makes packaging a lot easier (see trains-agent build --docker). You can make it even lighter and use trains-agent to set up and launch an experiment without packaging it, by using a base container and letting trains-agent set up everything inside the container (see trains-agent execute).
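As a rough sketch of the two options just described, the helpers below only assemble the command lines; they do not run anything. The `--id` and `--target` flags and the use of `--docker` with `execute` are assumptions based on the agent's help output as I remember it, so please check `trains-agent build --help` and `trains-agent execute --help` before relying on them. The function names, task ID, and image names are placeholders.

```python
def build_command(task_id, image_name):
    """Option 1: package the experiment into its own Docker image.

    Assumed flags: --id selects the experiment, --target names the new image.
    """
    return ["trains-agent", "build", "--id", task_id,
            "--docker", "--target", image_name]


def execute_command(task_id, base_image):
    """Option 2: no packaging; set the experiment up inside a base container.

    Assumed flag usage: --docker followed by the base image to start from.
    """
    return ["trains-agent", "execute", "--id", task_id,
            "--docker", base_image]
```

Either list can be handed to `subprocess.run(cmd, check=True)` on a machine where the agent is installed and configured.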
Does that remove a bit of the mystery? What exactly is your use case? (Is it more development-oriented, or at the productization stage?)