dm_control
Open for collab: rewriting the dm_control codebase in C++
I enjoy this library and have used it in research. I'm currently writing my graduate paper, where I'll present my benchmark for the RGB stacking problem. Looking through the codebase, I see a lot of room for optimization if a high-performance language like C++ were used instead; it could interface directly with MuJoCo, since that's written in C. I'm looking to collaborate with other C++ developers to create a C++ version, and maybe write a Python wrapper over that instead. I currently get very low FPS during training, and it's not really feasible to train on just a few cores. I have a private repo where I implemented PPO, A2C and IMPALA with full multi-threading, MPI and template support, using the LibTorch API. I'm open to sharing that with any developer who is willing to collaborate.
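For context, the kind of C++ wrapper being discussed might expose a step/reset interface in the style of dm_control's `Environment`. The sketch below is purely illustrative: the `TimeStep` struct and the toy point-mass dynamics are hypothetical stand-ins, not actual dm_control or MuJoCo code.

```cpp
#include <array>
#include <cmath>

// Hypothetical sketch of a dm_control-style step result.
struct TimeStep {
  std::array<double, 2> observation;  // e.g. position, velocity
  double reward;
  bool last;  // true when the episode ends
};

// Toy point-mass "environment": not real dm_control/MuJoCo code,
// just an illustration of the interface a C++ port could expose.
class PointMassEnv {
 public:
  TimeStep reset() {
    pos_ = 1.0;
    vel_ = 0.0;
    t_ = 0;
    return {{pos_, vel_}, 0.0, false};
  }

  TimeStep step(double action) {
    // Damped dynamics driven by the action (explicit Euler integration).
    vel_ += (action - 0.1 * vel_) * dt_;
    pos_ += vel_ * dt_;
    ++t_;
    // Shaped reward for staying near the origin.
    double reward = std::exp(-pos_ * pos_);
    return {{pos_, vel_}, reward, t_ >= horizon_};
  }

 private:
  double pos_ = 0.0, vel_ = 0.0, dt_ = 0.02;
  int t_ = 0, horizon_ = 500;
};
```

In a real port, `step` would call `mj_step` on an `mjModel`/`mjData` pair instead of toy dynamics, and the training loop (PPO/A2C/IMPALA workers) could run many such environments across threads without the GIL getting in the way.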
Hi, ava. How is your graduation project going? Similarly, I will be graduating next year with my PhD and have recently become very interested in the C++ version of dm_control and would love to be a part of this great endeavor of yours!
Oh great, this was a long time ago. I didn't get a response, so I didn't start it, but if you're interested, I'm down to discuss more. I currently work as a software engineer at Microsoft. It's great that you're pursuing a PhD!
If you're developing native tasks, you should take a look at the structure of the reward/cost functions in the MJPC project. While RL tasks require more specification (e.g., observations), it would be great if you could comply with that format (e.g., to validate your reward in real-time through MJPC optimization before using it in RL, which is often much slower).
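To make the suggested format concrete: MJPC expresses each cost term as one entry of a residual vector, and the total cost is a weighted norm of those entries. Below is a hedged, standalone sketch of that shape; the `mjModel`/`mjData` structs here are minimal stand-ins so the snippet compiles on its own (the real MJPC residual callbacks take the actual MuJoCo types from `mujoco.h`), and the specific terms are invented for illustration.

```cpp
#include <cmath>

// Stand-ins for MuJoCo's mjModel/mjData so this sketch is self-contained;
// real MJPC code uses the actual types from mujoco.h.
struct mjModel {};
struct mjData {
  double qpos[1];  // joint positions
  double qvel[1];  // joint velocities
  double ctrl[1];  // actuator controls
};

// MJPC-style residual: each entry is one cost term the planner penalizes.
// An RL reward could reuse the same terms, keeping the two in sync.
void Residual(const mjModel* model, const mjData* data, double* residual) {
  residual[0] = data->qpos[0] - 1.0;  // distance to a goal position
  residual[1] = data->qvel[0];        // velocity penalty
  residual[2] = data->ctrl[0];        // control effort
}

// Scalar cost as a weighted sum of squared residual entries.
double Cost(const double* residual, const double* weights, int n) {
  double c = 0.0;
  for (int i = 0; i < n; ++i) c += weights[i] * residual[i] * residual[i];
  return c;
}
```

Structuring the reward this way means the same residual can be dropped into MJPC for fast planner-based sanity checks before committing to a long RL run.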
I'll leave the issue open as an advertisement for your project. I wish you luck, and don't hesitate to open a new issue if you run into any difficulties.