stable-baselines3
[Question] C++ Inference
Question
Hello, I am using SB3 to train models whose inference should run on embedded robots using C++. I had a look at the PyTorch documentation, and doing this "by hand" is not very hard, but the process could indeed be automated.
I could maybe contribute, but I would first like some questions answered about how you imagine the feature.
In my opinion, there is more that can be done than simply documenting it, because we could help automate the observation and action rescaling and normalization, in a way such that methods like `model.predict`
get converted seamlessly to C++.
Here is how I would see it:
- We provide some piece of C++
- We provide the methods that export the JIT PyTorch models (`.pt`) from the different agents implemented in SB3
- We provide some code that generates some C++ (at least, we need to be able to export the action and observation definitions so that the pre/post-processing of observations and actions is implemented in C++ as well)
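To make that last point concrete, here is a sketch (with hypothetical names, not actual SB3 or Zoo code) of the action rescaling a generated C++ predictor would need to replicate for Box action spaces, mirroring how `model.predict` maps a squashed action in [-1, 1] back to the environment's bounds:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: maps a squashed action in [-1, 1] back to the
// Box bounds of the environment. This mirrors the unscaling SB3 applies
// inside model.predict() for continuous action spaces; names are illustrative.
std::vector<double> unscale_action(const std::vector<double>& action,
                                   const std::vector<double>& low,
                                   const std::vector<double>& high) {
    std::vector<double> out(action.size());
    for (std::size_t i = 0; i < action.size(); ++i) {
        // Linear map: -1 -> low[i], +1 -> high[i]
        out[i] = low[i] + 0.5 * (action[i] + 1.0) * (high[i] - low[i]);
    }
    return out;
}
```

This is exactly the kind of code that could be emitted from the exported action space definition, since only `low` and `high` vary per environment.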
What do you think ?
Additional context
Checklist
- [X] I have read the documentation (required)
- [X] I have checked that there is no similar issue in the repo (required)
Hello, that would be a valuable extension to SB3 but should be done in the RL Zoo I think (or in an external repo).
Here is how I would see it:
Are those different options, or a list of features?
but yes, in the end it would be nice to have something like `python sb3_export.py --algo a2c --env MyEnv -i path_to_model`
or, in the case of the RL Zoo: `python sb3_export.py --algo a2c --env MyEnv -f exp_folder --exp-id 1`
(it finds the experiment folder automatically and also loads the normalization and other wrappers if needed)
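Replicating those normalization wrappers on the C++ side mostly means replaying the saved running statistics. A minimal sketch of the VecNormalize observation transform, assuming its standard (obs - mean) / sqrt(var + eps) form with clipping (names and default values here are illustrative, not taken from the PR):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Sketch of the observation normalization an exported C++ predictor would
// have to apply when the model was trained with VecNormalize:
// (obs - mean) / sqrt(var + eps), clipped to [-clip, clip].
// mean/var would come from the saved running statistics.
std::vector<double> normalize_obs(const std::vector<double>& obs,
                                  const std::vector<double>& mean,
                                  const std::vector<double>& var,
                                  double eps = 1e-8, double clip = 10.0) {
    std::vector<double> out(obs.size());
    for (std::size_t i = 0; i < obs.size(); ++i) {
        double v = (obs[i] - mean[i]) / std::sqrt(var[i] + eps);
        out[i] = std::clamp(v, -clip, clip);  // clip extreme observations
    }
    return out;
}
```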
I would like some questions to be answered about how do you imagine the feature.
what are your questions?
My list is a list of features.
About questions, I mostly wanted to know what the recommended direction for this was. Making it a contrib to the RL Zoo indeed looks like the way to go, since the export can be achieved "from the outside" of SB3.
Making it a contrib to the RL Zoo indeed looks like the way to go, since the export can be achieved "from the outside" of SB3.
Feel free to open a draft PR there if you want to discuss it in more details ;)
I started working on that in the RL Zoo; I will indeed open a draft PR soon. Even if it will not support everything, it can be used as a base for discussion.
I have a very short-term goal of embedding inference in our humanoid robots, so I will also be the first user.
Ok, I started a draft PR
https://github.com/DLR-RM/rl-baselines3-zoo/pull/228/
Design choices
- The export is initiated through `enjoy.py`, since I didn't want to duplicate or factor out the environment loading logic; to start the export, pass `--export-cpp target_directory` to `enjoy.py` (supplementing all the usual flags to load your model)
- There is a `cpp/` directory in the RL Zoo that is a "skeleton" C++ project; when the export starts, the target directory is created from this "template"
- For each environment, a .h and a .cpp are generated; they contain:
- The model type
- Code to preprocess observations
- Code to post-process action
- So far only action inference is provided
- Models are exported using the PyTorch JIT script tracer; they are embedded in the binary as resources using CMRC. This is convenient for now, but if we want to be able to load multiple models and/or use big models, we might want to use external files
- You can run `--export-cpp` multiple times for multiple environments; it will add/update the classes in the target project
A procedure to test
To test that it indeed works, I added an option to generate a Python binding while building, so that we can directly use Python's gym to test it. Here are the steps:
- Be sure you have PyTorch installed and in your `CMAKE_PREFIX_PATH`
- Install `pybind11`, for instance with `apt-get install python3-dev python3-pybind11`
- Train, for instance, `DQN` with `CartPole-v1`, and then run something like: `python enjoy.py --env CartPole-v1 --algo dqn -f logs/ --export-cpp predictors/`
- Go into `predictors` and build:
  - `cd predictors`
  - `mkdir build/ && cd build/`
  - `cmake -DBASELINES3_PYBIND=ON ..`
  - `make -j8`
- This should produce a `libbaselines3_models.so`; that is where your predictors are
- This should also produce something like `baselines3_py.cpython-36m-x86_64-linux-gnu.so`, which allows you to test that it works using a Python env
From here you can test with such a script:
    import gym
    from baselines3_py import CartPole_v1

    cp = CartPole_v1()
    env = gym.make('CartPole-v1')
    obs = env.reset()
    while True:
        action = cp.predict(obs)
        obs, reward, done, info = env.step(int(action[0]))
        env.render("human")
        if done:
            obs = env.reset()
It should show you the CartPole, using the C++ built library for prediction. You can also of course build without the Python binding and use the library from your C++ code.
An example can be found in `predict.cpp`, which is built as a binary if you set `BASELINES3_BIN` to `ON` (hard-coded for `CartPole-v1`).
Maybe you are just looking for something like this... https://onnxruntime.ai/docs/
@stalek71 thanks for the lead; however, the problem here is not exactly to save a PyTorch model to a file and load it from C++ (which is explained in [1]), but to export SB3 models (at least, in the first place, `model.predict()`) to C++.
More specifically, it implies:
- Export gym env's action and observation spaces for possible rescaling
- Export relevant neural networks (this depends on the underlying agent)
- Having the inference logic translated to C++ (for instance, in the case of DQN it means taking the argmax of the output Q-values, but for TD3 it means taking the output of an actor network, and so on)
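The DQN case in that last bullet is simple enough to sketch in plain C++: the exported network's forward pass produces Q-values, and the deterministic action is their argmax (an illustration, not the actual generated code):

```cpp
#include <cstddef>
#include <vector>

// Illustrative DQN post-processing: given the Q-values returned by the
// exported network's forward pass, the deterministic (greedy) action is
// simply the index of the largest Q-value.
std::size_t greedy_action(const std::vector<float>& q_values) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < q_values.size(); ++i) {
        if (q_values[i] > q_values[best]) best = i;
    }
    return best;
}
```

For TD3 there would be no such step: the actor network's output (after the action rescaling) is the action itself, which is why the generated code has to be agent-specific.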
This is not a very complicated matter, but it requires some specific knowledge of how the underlying RL agents work. So I guess it's good to have the whole process automated.
(The use case is: I want the thing running in a robot without running any Python code (because it is embedded real-time robotics application).)
[1] https://pytorch.org/tutorials/advanced/cpp_export.html
Thanks for the PR =)
The export is initiated through `enjoy.py`, since I didn't want to duplicate or factor out the environment loading logic; to start the export, pass `--export-cpp target_directory` to `enjoy.py` (supplementing all the usual flags to load your model)
sounds good.
So far only action inference is provided
I think we should keep the first version as simple as possible (for instance limiting ourselves to a subset of models or action spaces)
they are embedded in the binary as resources using CMRC
how much more difficult is it to just give a path to `torch.jit.load()`?
Go in predictor and build:
this could be even automated, no?
from baselines3_py import CartPole_v1
I would rather keep the name of the algorithm (or concatenate it with the name of the env) to avoid confusion.
This should produce a libbaselines3_models.so, that is where your predictors are
Do you have also an example cpp file to show how to use that shared lib?
for the name, we will discuss it too (whether it should be `baselines3` or `sb3` or `stablebaselines3`; I would lean towards the last two ;))
action = cp.predict(obs)
This is not consistent with the SB3 API, but I think it's fine as it is targeted towards deployment (and only RecurrentPPO requires a hidden state).
how much more difficult is it to just give a path to `torch.jit.load()`?
It is not; I can make it this way, dropping the dependency on CMRC or making it optional
this could be even automated, no?
Enabling the Python binding is more like a test than a real use case. There is likely not much of a performance boost, because most of the computation is actually done by PyTorch; it's just about embedding it in C++.
But yes, we could automate the build and run Python tests on top of the library for unit-testing and CI purposes
Do you have also an example cpp file to show how to use that shared lib?
So far just the simple: https://github.com/Gregwar/rl-baselines3-zoo/blob/export_cpp/cpp/src/predict.cpp
dropping the dependency on CMRC or making it optional
Fewer dependencies are usually better ;)
Enabling the Python binding is more like a test than a real use case.
I meant automating the build of the shared lib, but I probably misunderstood what you wrote.
we could automate the build and run of Python tests on the top of library for unit test purposes
this would be nice to have at least some test on the CI (nothing too complicated)
So far just the simple:
thanks =)
Closing this one in favor of another one that is going to be opened soon (per discussion with Grégoire).