Distributed sensors first implementation

Open Blast545 opened this issue 2 years ago • 8 comments

First iteration towards #942

Summary

First demo of distributed simulation, approached by distributing the rendering parts. The current implementation does not distribute the components themselves. Instead, it reuses the existing logic for distributing performers in order to distribute the sensors/postUpdate components associated with each performer.

Workflow of the implementation

  1. Primary populates affinities distributing performers evenly.
  2. Primary sets up its simulationRunner to run the systems.
  3. Secondaries set up their simulationRunner to run systems with postUpdate methods, specifically the sensors plugin (see secondary.sdf).
  4. Primary manages the step/clock of the simulation, with full synchronization.
  5. Stepping starts with the primary, which runs the preUpdate and update sections of the simulation.
  6. Primary sends the serialized state to the secondaries.
  7. Each secondary runs its assigned postUpdate tasks and replies with an acknowledgment to the primary.
  8. After the primary receives the acknowledgment from all the secondaries, it continues with the next step.
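
The steps above can be sketched as a lock-step loop. This is a minimal, single-process illustration and not the code in this PR: the `Primary` and `Secondary` classes, the JSON payload, and the performer names are all hypothetical stand-ins for the real simulationRunner and network transport.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Secondary:
    """Hypothetical secondary: runs only postUpdate work for its performers."""
    name: str
    performers: list = field(default_factory=list)
    rendered: list = field(default_factory=list)

    def post_update(self, serialized_state: str) -> str:
        state = json.loads(serialized_state)
        # Step 7: read the state, do the assigned postUpdate work, then acknowledge.
        for performer in self.performers:
            self.rendered.append((state["iteration"], performer))
        return "ack"


class Primary:
    """Hypothetical primary: owns the clock and runs preUpdate/update."""

    def __init__(self, performers, secondaries):
        self.state = {"iteration": 0, "sim_time": 0.0}
        self.secondaries = secondaries
        # Step 1: populate affinities, spreading performers evenly (round-robin).
        for i, performer in enumerate(performers):
            secondaries[i % len(secondaries)].performers.append(performer)

    def step(self, dt=0.001):
        # Step 5: preUpdate and update happen only on the primary.
        self.state["iteration"] += 1
        self.state["sim_time"] += dt
        # Step 6: send the serialized state to every secondary.
        payload = json.dumps(self.state)
        acks = [s.post_update(payload) for s in self.secondaries]
        # Step 8: continue only once every secondary has acknowledged.
        if any(ack != "ack" for ack in acks):
            raise RuntimeError("a secondary did not acknowledge the step")


secondaries = [Secondary("secondary_0"), Secondary("secondary_1")]
primary = Primary(["robot_a", "robot_b", "robot_c", "robot_d"], secondaries)
for _ in range(3):
    primary.step()
```

Because `step` blocks until every acknowledgment arrives, the slowest secondary sets the pace of the whole simulation, which is the cost of full synchronization.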

(Is it necessary to send the serialized state back from each secondary to the primary? AFAIK the postUpdate phase does not change the ecm components state. If I'm wrong, the secondaries should send the state of their assigned components for the primary to consolidate them).
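
If the secondaries did need to report state back, the consolidation could look roughly like the sketch below. It assumes each entity is assigned to exactly one secondary and that a reply is a plain mapping of entity id to changed components. The `consolidate` function and the `last_render_time` component name are hypothetical and only serve the illustration.

```python
def consolidate(primary_state, secondary_replies):
    """Merge per-entity component updates returned by the secondaries.

    Each reply maps entity id -> {component name: value}. Entities are
    assumed to belong to exactly one secondary, so overlap is an error.
    """
    # Copy so the primary's own state is left untouched until the merge succeeds.
    merged = {entity: dict(components) for entity, components in primary_state.items()}
    seen = set()
    for reply in secondary_replies:
        for entity, components in reply.items():
            if entity in seen:
                raise ValueError(f"entity {entity} reported by two secondaries")
            seen.add(entity)
            merged.setdefault(entity, {}).update(components)
    return merged


primary_state = {1: {"pose": (0, 0, 0)}, 2: {"pose": (1, 0, 0)}}
replies = [{1: {"last_render_time": 0.1}}, {2: {"last_render_time": 0.1}}]
merged = consolidate(primary_state, replies)
```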

Test it

Run in three different terminals:

    ./examples/scripts/distributed_rendering/primary.sh
    ./examples/scripts/distributed_rendering/secondary.sh
    ./examples/scripts/distributed_rendering/secondary.sh

The sensors_demo.sdf was adjusted to serve as the demo for this PR. Half of the sensors run in each secondary. The physics step runs in the primary.

Checklist

  • [ ] Signed all commits for DCO
  • [ ] Added tests
  • [ ] Added example and/or tutorial
  • [ ] Updated documentation (as needed)
  • [ ] Updated migration guide (as needed)
  • [ ] codecheck passed (See contributing)
  • [ ] All tests passed (See test coverage)
  • [ ] While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge

Signed-off-by: Jorge Perez [email protected]

Blast545 avatar Aug 19 '21 14:08 Blast545

Code from last demo is tracked up to b14ac8e. Not changing into formal PR because leaving the simulation running for some time crashes my PC. I'm investigating the cause of that.

Blast545 avatar Aug 31 '21 18:08 Blast545

Added an extra command line option, --network-render, so that this sensors implementation can be used without removing the current performers implementation.

Blast545 avatar Sep 07 '21 13:09 Blast545

ign-gazebo4 is EOL'ing with Dome in the coming days, please retarget this PR at a supported version.

chapulina avatar Dec 21 '21 01:12 chapulina

What is the current status? Are there any plans to integrate this into Fortress? How can I help you?

FilipeAlmeida-Movai avatar Jan 24 '22 09:01 FilipeAlmeida-Movai

> Are there any plans to integrate this into Fortress? How can I help you?

Yes, this PR is near the top of my review plate. Feel free to try it out, and offer feedback. More eyes and testing is always better.

There is also distributed simulation work being done in #481.

nkoenig avatar Jan 24 '22 18:01 nkoenig

I tested it! This greatly improves simulation performance and divides the sensors' GPU processing. However, I noticed that some sensors are not rendering all objects in the scene. I am available to run more tests and share my feedback. Also, feel free to contact me to discuss it. I am sharing a picture from one of my tests with our robots, showing that the sensors do not render some objects in the scene.

image

FilipeAlmeida-Movai avatar Feb 01 '22 15:02 FilipeAlmeida-Movai

@FilipeAlmeida-Movai Thanks for your interest in this PR! Can you tell us the setup you used for testing it and the .sdf file you have for these pictures?

At the time this PR was opened, we only ran tests with the sensors available in the .sdf files uploaded with this PR. It's mainly a proof of concept to check whether distributing the sensors leads to better performance than the idea proposed in #481.

Blast545 avatar Feb 03 '22 16:02 Blast545

> @FilipeAlmeida-Movai Thanks for your interest in this PR! Can you tell us the setup you used for testing it and the .sdf file you have for these pictures?
>
> At the time this PR was opened we only ran tests with the sensors available in the .sdf files uploaded with this PR, it's mainly a proof of concept to check if distributing the sensors lead to a better performance than the idea proposed with #481

Yes. I ran the test scripts from the description of this PR and got the same results. The implementation divides the rendered scene between the secondaries, not only the sensors. So a sensor only "sees" what is rendered by the secondary it was assigned to.

You can run the tests and check if all cameras are getting all objects.

I already found where the code divides the objects among the secondaries and tried a modification: adapting the performers so that only the rendering of each performer is separated, instead of separating the entire scene. I ran it and it works! I was able to run 2 robots and split the sensor computation of each one across the secondaries.

Furthermore, I am really interested in this PR. I think that dividing the sensor rendering across different GPU processes is the best approach to improving the performance of robot simulation. Please let me know if I can help with anything!
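
The difference between the two behaviours described above can be illustrated with a small sketch. It is not taken from the PR code: the object names, the affinity tables, and both functions are hypothetical, and field of view is ignored so that only the partitioning effect shows.

```python
# Hypothetical scene and assignments, for illustration only.
scene_objects = {"robot_a", "robot_b", "box", "ground"}
sensor_affinity = {"camera_a": "secondary_0", "camera_b": "secondary_1"}
object_affinity = {
    "robot_a": "secondary_0",
    "box": "secondary_0",
    "robot_b": "secondary_1",
    "ground": "secondary_1",
}


def visible_when_scene_is_split(sensor):
    """Observed behaviour: a sensor only sees objects held by its own secondary."""
    host = sensor_affinity[sensor]
    return {obj for obj, owner in object_affinity.items() if owner == host}


def visible_when_only_sensors_are_split(sensor):
    """Modified behaviour: every secondary holds the full scene and only
    the sensor work is divided, so each sensor can see every object."""
    return set(scene_objects)


print(sorted(visible_when_scene_is_split("camera_a")))
print(sorted(visible_when_only_sensors_are_split("camera_a")))
```

In the first case `camera_a` misses `robot_b` and `ground`, which matches the missing objects reported in the picture. In the second case each secondary pays the memory cost of the full scene but every sensor output is complete.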

FilipeAlmeida-Movai avatar Feb 03 '22 18:02 FilipeAlmeida-Movai

This has been open for quite some time, and I don't think there is any intention to pick it up in the near future. I'm going to close it.

mjcarroll avatar Apr 21 '23 11:04 mjcarroll