
feature(nyz): add new middleware distributed demo

Open PaParaZz1 opened this issue 2 years ago • 6 comments

Description

  • [x] DataParallel demo
  • [x] DistributedDataParallel demo
  • [x] tb logger example
  • [x] Distributed RL demo (Ape-X type)
  • [ ] Distributed RL demo (APPO type)
  • [ ] Distributed RL demo (R2D2 type)
  • [ ] Distributed RL demo (IMPALA type)
  • [ ] Distributed RL demo (SEED RL type)
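The Ape-X item above refers to the pattern of many distributed actors feeding prioritized transitions into one central replay buffer that a single learner samples from. A minimal single-process sketch of that data flow (all names here are illustrative, not DI-engine's actual middleware API; a real Ape-X setup runs actors as separate processes and uses a sum-tree for sampling):

```python
import random
from collections import deque

class PrioritizedBuffer:
    """Toy prioritized replay buffer (Ape-X keeps one of these centrally)."""
    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def push(self, transition, priority):
        self.data.append((priority, transition))

    def sample(self, k):
        # Sample proportionally to priority (real Ape-X uses a sum-tree).
        priorities = [p for p, _ in self.data]
        return random.choices([t for _, t in self.data], weights=priorities, k=k)

def actor(actor_id, buffer, n_steps):
    """Each distributed actor collects transitions and assigns initial priorities."""
    for step in range(n_steps):
        transition = (actor_id, step)
        buffer.push(transition, priority=random.uniform(0.1, 1.0))

buffer = PrioritizedBuffer(capacity=1000)
for i in range(4):          # 4 actors; Ape-X would run these as separate processes
    actor(i, buffer, n_steps=50)
batch = buffer.sample(32)   # the central learner samples a prioritized batch
```

The APPO, R2D2, IMPALA, and SEED RL items differ mainly in what flows through this pipe (trajectories vs. transitions, observations vs. actions) and where inference runs.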

Related Issue

#102 #176

TODO

Check List

  • [ ] merge the latest version source branch/repo, and resolve all the conflicts
  • [ ] pass style check
  • [ ] pass all the tests

PaParaZz1 avatar May 15 '22 11:05 PaParaZz1

Codecov Report

Merging #321 (dde6009) into main (dd2b3a5) will decrease coverage by 0.59%. The diff coverage is 79.88%.

@@            Coverage Diff             @@
##             main     #321      +/-   ##
==========================================
- Coverage   85.39%   84.79%   -0.60%     
==========================================
  Files         532      556      +24     
  Lines       43943    44718     +775     
==========================================
+ Hits        37523    37919     +396     
- Misses       6420     6799     +379     
Flag Coverage Δ
unittests 84.79% <79.88%> (-0.60%) :arrow_down:

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
ding/data/buffer/tests/test_buffer_benchmark.py 37.70% <ø> (ø)
ding/entry/tests/test_cli_ditask.py 100.00% <ø> (ø)
ding/policy/base_policy.py 74.85% <ø> (+0.84%) :arrow_up:
ding/policy/sac.py 60.29% <ø> (-0.08%) :arrow_down:
...ework/middleware/functional/termination_checker.py 22.50% <16.66%> (-8.75%) :arrow_down:
ding/data/tests/test_model_loader.py 23.63% <23.63%> (ø)
ding/framework/tests/test_task.py 92.50% <35.71%> (-7.50%) :arrow_down:
ding/framework/middleware/functional/trainer.py 84.84% <40.00%> (-3.04%) :arrow_down:
ding/policy/dqn.py 87.34% <40.00%> (-1.54%) :arrow_down:
ding/framework/middleware/functional/enhancer.py 39.65% <42.85%> (-0.73%) :arrow_down:
... and 271 more


codecov[bot] avatar May 15 '22 12:05 codecov[bot]

What is the throughput of this? Does this beat SampleFactory? @PaParaZz1 @sailxjx

zxzzz0 avatar Jun 21 '22 21:06 zxzzz0

@zxzzz0 The goal here is not to compete with Sample Factory on raw speed. The bottleneck of RL training can appear in any of collecting, training, or evaluation: for example, collecting that is too fast relative to training can widen the generation gap in the data and cause the model to underfit, and because of the GIL, deserializing data on the training process can also slow down overall training. There are many such trade-offs to consider in this project. This issue provides a new design pattern for global RL training, starting from the idea that users should be able to scale from single-machine studies to large-scale distributed systems without large code-modification costs or performance losses. If environment-side collecting is really your main concern, you can use Sample Factory inside DI-engine to achieve the collecting efficiency you expect.
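The GIL point can be made concrete: even if collectors serialize data in parallel processes, the learner process still pays the deserialization cost itself, and that work holds the GIL. A small illustrative measurement (the payload here is arbitrary, not a real DI-engine batch):

```python
import pickle
import time

# Illustrative only: measure how long a learner process would spend just
# deserializing a batch received from collectors. This CPU work holds the
# GIL, so it cannot overlap with other Python threads in the same process.
batch = [[float(i) for i in range(1000)] for _ in range(100)]
blob = pickle.dumps(batch)

start = time.perf_counter()
restored = pickle.loads(blob)
elapsed = time.perf_counter() - start

print(f"payload: {len(blob)} bytes, deserialize: {elapsed * 1e3:.2f} ms")
```

At scale, this per-batch cost is one reason a "fast collector" alone does not guarantee fast end-to-end training.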

sailxjx avatar Jun 22 '22 01:06 sailxjx

If you are really very very concerned about environment-side collecting

No. To clarify, we only care about overall performance, i.e., the time it takes to reach a certain reward in the end.

Usually, if you can squeeze every drop of performance out of the CPU/GPU, you can learn faster. Environment-side collecting is just one indicator among many. You also have to pay attention to learner FPS, GPU utilization, and other indicators to understand the throughput of the whole system.

Benchmarking should target not the collector side alone but the overall growth speed of the reward.
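The indicators mentioned here (collector FPS, learner samples/s) can all be tracked with the same simple primitive. A minimal sketch of such a meter (names hypothetical, not an existing DI-engine utility):

```python
import time

class ThroughputMeter:
    """Tracks items processed per second for any stage (collector, learner, ...)."""
    def __init__(self):
        self.start = time.perf_counter()
        self.count = 0

    def update(self, n):
        """Record that n more items (frames, samples, ...) were processed."""
        self.count += n

    def rate(self):
        """Items per second since construction."""
        elapsed = time.perf_counter() - self.start
        return self.count / elapsed if elapsed > 0 else 0.0

collector_fps = ThroughputMeter()
learner_sps = ThroughputMeter()
collector_fps.update(4096)   # env frames collected so far
learner_sps.update(1024)     # samples consumed by the learner so far
```

Watching the ratio of these two rates is often more informative than either alone: it shows whether the collector or the learner is the current bottleneck.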

zxzzz0 avatar Jun 22 '22 14:06 zxzzz0

No. To clarify, we only care about overall performance, which means the time it will take to reach certain reward in the end.

Yeah, that's right: the purpose of the distributed version is to maximize overall performance while not requiring much effort to write code across multiple tasks.

Another consideration is that we need to go design-first. Only after the upper-layer interface is unified and stable will it be possible to gradually optimize every aspect of performance without disturbing users. You can see that from version 0.x to version 1.0 we have gradually developed a definite interface style, and the purpose of this branch is to extend that interface style to distributed operation.
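The "unified interface style" being discussed is a middleware pipeline: the user registers small functions on a task, and the same pipeline can then be run locally or distributed. A toy sketch of that pattern (this is an illustration of the idea, not DI-engine's exact `task` API):

```python
class Task:
    """Minimal pipeline runner: registered middleware run in order each iteration."""
    def __init__(self):
        self.middleware = []

    def use(self, fn):
        """Register a middleware function; returns self for chaining."""
        self.middleware.append(fn)
        return self

    def run(self, max_step):
        ctx = {"step": 0, "log": []}
        for _ in range(max_step):
            for fn in self.middleware:
                fn(ctx)
            ctx["step"] += 1
        return ctx

def collect(ctx):
    ctx["log"].append(("collect", ctx["step"]))

def train(ctx):
    ctx["log"].append(("train", ctx["step"]))

task = Task()
task.use(collect).use(train)   # the same pipeline could be split across processes
ctx = task.run(max_step=2)
```

Because the user only composes middleware, a distributed runtime can later reassign individual middleware to different processes or machines without changing user code, which is exactly the design-first benefit described above.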

sailxjx avatar Jun 23 '22 02:06 sailxjx

Sounds good. In the future, please benchmark the different designs/interfaces so that you can be confident you have chosen the design with the best overall performance.

If you don't benchmark (as I did earlier for DI-engine) and only discover a possible performance improvement after the design is frozen in version 1.0, you won't be able to change it without a major version update.

zxzzz0 avatar Jun 23 '22 14:06 zxzzz0