dreamer-pytorch

some bug with "python main.py" ~v~

Open: kabuwaniu opened this issue 2 years ago • 1 comment

Dear authors, after running "python main.py", I get an error.

run 0 already exists.
run 1 already exists.
run 2 already exists.
run 3 already exists.
run 4 already exists.
run 5 already exists.
run 6 already exists.
run 7 already exists.
run 8 already exists.
run 9 already exists.
run 10 already exists.
run 11 already exists.
Using run id = 12
2022-07-13 11:45:12.949051 | dreamer_pong_12 Runner master CPU affinity: [0, 1, 2, 3, 4, 5, 6, 7].
2022-07-13 11:45:12.949116 | dreamer_pong_12 Runner master Torch threads: 4.
using seed 970
2022-07-13 11:45:14.629311 | dreamer_pong_12 Sampler decorrelating envs, max steps: 0
2022-07-13 11:45:14.629631 | dreamer_pong_12 Serial Sampler initialized.
2022-07-13 11:45:14.629661 | dreamer_pong_12 Running 5000000 iterations of minibatch RL.
/home/uav-robot/anaconda3/envs/juliusfrost/lib/python3.8/site-packages/torch/optim/adam.py:90: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
  super(Adam, self).__init__(params, defaults)
2022-07-13 11:45:14.630298 | dreamer_pong_12 Optimizing over 1000 iterations.
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03
2022-07-13 11:45:18.363206 | dreamer_pong_12 itr #999 saving snapshot...
2022-07-13 11:45:18.388679 | dreamer_pong_12 itr #999 saved 2022-07-13 11:45:18.396905 | ----------------------------- ------------ 2022-07-13 11:45:18.396941 | Diagnostics/NewCompletedTrajs 2 2022-07-13 11:45:18.396986 | Diagnostics/StepsInTrajWindow 1000 2022-07-13 11:45:18.397003 | Diagnostics/Iteration 999 2022-07-13 11:45:18.397028 | Diagnostics/CumTime (s) 3.75844 2022-07-13 11:45:18.397065 | Diagnostics/CumSteps 1000 2022-07-13 11:45:18.397090 | Diagnostics/CumCompletedTrajs 2 2022-07-13 11:45:18.397130 | Diagnostics/CumUpdates 0 2022-07-13 11:45:18.397179 | Diagnostics/StepsPerSecond 266.068 2022-07-13 11:45:18.397235 | Diagnostics/UpdatesPerSecond 0 2022-07-13 11:45:18.397261 | Diagnostics/ReplayRatio 0 2022-07-13 11:45:18.397300 | Diagnostics/CumReplayRatio 0 2022-07-13 11:45:18.397325 | Length/Average 500 2022-07-13 11:45:18.397365 | Length/Std 0 2022-07-13 11:45:18.397379 | Length/Median 500 2022-07-13 11:45:18.397402 | Length/Min 500 2022-07-13 11:45:18.397442 | Length/Max 500 2022-07-13 11:45:18.397466 | Return/Average -6 2022-07-13 11:45:18.397506 | Return/Std 0 2022-07-13 11:45:18.397520 | Return/Median -6 2022-07-13 11:45:18.397568 | Return/Min -6 2022-07-13 11:45:18.397582 | Return/Max -6 2022-07-13 11:45:18.397605 | NonzeroRewards/Average 6 2022-07-13 11:45:18.397645 | NonzeroRewards/Std 0 2022-07-13 11:45:18.397659 | NonzeroRewards/Median 6 2022-07-13 11:45:18.397682 | NonzeroRewards/Min 6 2022-07-13 11:45:18.397710 | NonzeroRewards/Max 6 2022-07-13 11:45:18.397721 | DiscountedReturn/Average -0.596882 2022-07-13 11:45:18.397731 | DiscountedReturn/Std 0.0359496 2022-07-13 11:45:18.397741 | DiscountedReturn/Median -0.596882 2022-07-13 11:45:18.397751 | DiscountedReturn/Min -0.632832 2022-07-13 11:45:18.397761 | DiscountedReturn/Max -0.560932 2022-07-13 11:45:18.397771 | GameScore/Average -6 2022-07-13 11:45:18.397801 | GameScore/Std 0 2022-07-13 11:45:18.397826 | GameScore/Median -6 2022-07-13 11:45:18.397854 | GameScore/Min -6 2022-07-13 
11:45:18.397864 | GameScore/Max -6 2022-07-13 11:45:18.397890 | loss/Average nan 2022-07-13 11:45:18.397900 | loss/Std nan 2022-07-13 11:45:18.397910 | loss/Median nan 2022-07-13 11:45:18.397919 | loss/Min nan 2022-07-13 11:45:18.397929 | loss/Max nan 2022-07-13 11:45:18.397939 | grad_norm_model/Average nan 2022-07-13 11:45:18.397968 | grad_norm_model/Std nan 2022-07-13 11:45:18.397978 | grad_norm_model/Median nan 2022-07-13 11:45:18.398003 | grad_norm_model/Min nan 2022-07-13 11:45:18.398031 | grad_norm_model/Max nan 2022-07-13 11:45:18.398040 | grad_norm_actor/Average nan 2022-07-13 11:45:18.398065 | grad_norm_actor/Std nan 2022-07-13 11:45:18.398093 | grad_norm_actor/Median nan 2022-07-13 11:45:18.398102 | grad_norm_actor/Min nan 2022-07-13 11:45:18.398127 | grad_norm_actor/Max nan 2022-07-13 11:45:18.398161 | grad_norm_value/Average nan 2022-07-13 11:45:18.398187 | grad_norm_value/Std nan 2022-07-13 11:45:18.398216 | grad_norm_value/Median nan 2022-07-13 11:45:18.398226 | grad_norm_value/Min nan 2022-07-13 11:45:18.398236 | grad_norm_value/Max nan 2022-07-13 11:45:18.398246 | model_loss/Average nan 2022-07-13 11:45:18.398256 | model_loss/Std nan 2022-07-13 11:45:18.398266 | model_loss/Median nan 2022-07-13 11:45:18.398276 | model_loss/Min nan 2022-07-13 11:45:18.398285 | model_loss/Max nan 2022-07-13 11:45:18.398295 | actor_loss/Average nan 2022-07-13 11:45:18.398305 | actor_loss/Std nan 2022-07-13 11:45:18.398315 | actor_loss/Median nan 2022-07-13 11:45:18.398325 | actor_loss/Min nan 2022-07-13 11:45:18.398335 | actor_loss/Max nan 2022-07-13 11:45:18.398345 | value_loss/Average nan 2022-07-13 11:45:18.398355 | value_loss/Std nan 2022-07-13 11:45:18.398365 | value_loss/Median nan 2022-07-13 11:45:18.398374 | value_loss/Min nan 2022-07-13 11:45:18.398384 | value_loss/Max nan 2022-07-13 11:45:18.398394 | prior_entropy/Average nan 2022-07-13 11:45:18.398404 | prior_entropy/Std nan 2022-07-13 11:45:18.398414 | prior_entropy/Median nan 2022-07-13 11:45:18.398424 | 
prior_entropy/Min nan 2022-07-13 11:45:18.398434 | prior_entropy/Max nan 2022-07-13 11:45:18.398443 | post_entropy/Average nan 2022-07-13 11:45:18.398453 | post_entropy/Std nan 2022-07-13 11:45:18.398463 | post_entropy/Median nan 2022-07-13 11:45:18.398473 | post_entropy/Min nan 2022-07-13 11:45:18.398483 | post_entropy/Max nan 2022-07-13 11:45:18.398493 | divergence/Average nan 2022-07-13 11:45:18.398503 | divergence/Std nan 2022-07-13 11:45:18.398513 | divergence/Median nan 2022-07-13 11:45:18.398523 | divergence/Min nan 2022-07-13 11:45:18.398533 | divergence/Max nan 2022-07-13 11:45:18.398543 | reward_loss/Average nan 2022-07-13 11:45:18.398553 | reward_loss/Std nan 2022-07-13 11:45:18.398562 | reward_loss/Median nan 2022-07-13 11:45:18.398572 | reward_loss/Min nan 2022-07-13 11:45:18.398582 | reward_loss/Max nan 2022-07-13 11:45:18.398592 | image_loss/Average nan 2022-07-13 11:45:18.398602 | image_loss/Std nan 2022-07-13 11:45:18.398612 | image_loss/Median nan 2022-07-13 11:45:18.398621 | image_loss/Min nan 2022-07-13 11:45:18.398631 | image_loss/Max nan 2022-07-13 11:45:18.398641 | pcont_loss/Average nan 2022-07-13 11:45:18.398651 | pcont_loss/Std nan 2022-07-13 11:45:18.398661 | pcont_loss/Median nan 2022-07-13 11:45:18.398671 | pcont_loss/Min nan 2022-07-13 11:45:18.398681 | pcont_loss/Max nan 2022-07-13 11:45:18.398691 | ----------------------------- ------------ 2022-07-13 11:45:18.398820 | dreamer_pong_12 itr #999 Optimizing over 1000 iterations. 0% [##############################] 100% | ETA: 00:00:00 Total time elapsed: 00:00:03 2022-07-13 11:45:22.201126 | dreamer_pong_12 itr #1999 saving snapshot... 
2022-07-13 11:45:22.230317 | dreamer_pong_12 itr #1999 saved 2022-07-13 11:45:22.239040 | ----------------------------- ----------- 2022-07-13 11:45:22.239081 | Diagnostics/NewCompletedTrajs 2 2022-07-13 11:45:22.239163 | Diagnostics/StepsInTrajWindow 2000 2022-07-13 11:45:22.239189 | Diagnostics/Iteration 1999 2022-07-13 11:45:22.239264 | Diagnostics/CumTime (s) 7.60008 2022-07-13 11:45:22.239300 | Diagnostics/CumSteps 2000 2022-07-13 11:45:22.239357 | Diagnostics/CumCompletedTrajs 4 2022-07-13 11:45:22.239444 | Diagnostics/CumUpdates 0 2022-07-13 11:45:22.239523 | Diagnostics/StepsPerSecond 260.306 2022-07-13 11:45:22.239546 | Diagnostics/UpdatesPerSecond 0 2022-07-13 11:45:22.239600 | Diagnostics/ReplayRatio 0 2022-07-13 11:45:22.239614 | Diagnostics/CumReplayRatio 0 2022-07-13 11:45:22.239664 | Length/Average 500 2022-07-13 11:45:22.239679 | Length/Std 0 2022-07-13 11:45:22.239729 | Length/Median 500 2022-07-13 11:45:22.239754 | Length/Min 500 2022-07-13 11:45:22.239796 | Length/Max 500 2022-07-13 11:45:22.239810 | Return/Average -5.75 2022-07-13 11:45:22.239834 | Return/Std 0.433013 2022-07-13 11:45:22.239874 | Return/Median -6 2022-07-13 11:45:22.239889 | Return/Min -6 2022-07-13 11:45:22.239912 | Return/Max -5 2022-07-13 11:45:22.239955 | NonzeroRewards/Average 5.75 2022-07-13 11:45:22.239979 | NonzeroRewards/Std 0.433013 2022-07-13 11:45:22.240006 | NonzeroRewards/Median 6 2022-07-13 11:45:22.240018 | NonzeroRewards/Min 5 2022-07-13 11:45:22.240028 | NonzeroRewards/Max 6 2022-07-13 11:45:22.240038 | DiscountedReturn/Average -0.537198 2022-07-13 11:45:22.240049 | DiscountedReturn/Std 0.116308 2022-07-13 11:45:22.240059 | DiscountedReturn/Median -0.587484 2022-07-13 11:45:22.240069 | DiscountedReturn/Min -0.632832 2022-07-13 11:45:22.240079 | DiscountedReturn/Max -0.340991 2022-07-13 11:45:22.240089 | GameScore/Average -5.75 2022-07-13 11:45:22.240099 | GameScore/Std 0.433013 2022-07-13 11:45:22.240109 | GameScore/Median -6 2022-07-13 11:45:22.240119 | 
GameScore/Min -6 2022-07-13 11:45:22.240129 | GameScore/Max -5 2022-07-13 11:45:22.240139 | loss/Average nan 2022-07-13 11:45:22.240150 | loss/Std nan 2022-07-13 11:45:22.240160 | loss/Median nan 2022-07-13 11:45:22.240170 | loss/Min nan 2022-07-13 11:45:22.240180 | loss/Max nan 2022-07-13 11:45:22.240190 | grad_norm_model/Average nan 2022-07-13 11:45:22.240200 | grad_norm_model/Std nan 2022-07-13 11:45:22.240210 | grad_norm_model/Median nan 2022-07-13 11:45:22.240220 | grad_norm_model/Min nan 2022-07-13 11:45:22.240230 | grad_norm_model/Max nan 2022-07-13 11:45:22.240240 | grad_norm_actor/Average nan 2022-07-13 11:45:22.240250 | grad_norm_actor/Std nan 2022-07-13 11:45:22.240260 | grad_norm_actor/Median nan 2022-07-13 11:45:22.240270 | grad_norm_actor/Min nan 2022-07-13 11:45:22.240280 | grad_norm_actor/Max nan 2022-07-13 11:45:22.240290 | grad_norm_value/Average nan 2022-07-13 11:45:22.240300 | grad_norm_value/Std nan 2022-07-13 11:45:22.240310 | grad_norm_value/Median nan 2022-07-13 11:45:22.240320 | grad_norm_value/Min nan 2022-07-13 11:45:22.240330 | grad_norm_value/Max nan 2022-07-13 11:45:22.240340 | model_loss/Average nan 2022-07-13 11:45:22.240350 | model_loss/Std nan 2022-07-13 11:45:22.240360 | model_loss/Median nan 2022-07-13 11:45:22.240370 | model_loss/Min nan 2022-07-13 11:45:22.240380 | model_loss/Max nan 2022-07-13 11:45:22.240390 | actor_loss/Average nan 2022-07-13 11:45:22.240400 | actor_loss/Std nan 2022-07-13 11:45:22.240410 | actor_loss/Median nan 2022-07-13 11:45:22.240420 | actor_loss/Min nan 2022-07-13 11:45:22.240430 | actor_loss/Max nan 2022-07-13 11:45:22.240440 | value_loss/Average nan 2022-07-13 11:45:22.240453 | value_loss/Std nan 2022-07-13 11:45:22.240464 | value_loss/Median nan 2022-07-13 11:45:22.240474 | value_loss/Min nan 2022-07-13 11:45:22.240484 | value_loss/Max nan 2022-07-13 11:45:22.240494 | prior_entropy/Average nan 2022-07-13 11:45:22.240504 | prior_entropy/Std nan 2022-07-13 11:45:22.240514 | prior_entropy/Median nan 
2022-07-13 11:45:22.240524 | prior_entropy/Min nan 2022-07-13 11:45:22.240534 | prior_entropy/Max nan 2022-07-13 11:45:22.240544 | post_entropy/Average nan 2022-07-13 11:45:22.240553 | post_entropy/Std nan 2022-07-13 11:45:22.240563 | post_entropy/Median nan 2022-07-13 11:45:22.240573 | post_entropy/Min nan 2022-07-13 11:45:22.240583 | post_entropy/Max nan 2022-07-13 11:45:22.240593 | divergence/Average nan 2022-07-13 11:45:22.240603 | divergence/Std nan 2022-07-13 11:45:22.240613 | divergence/Median nan 2022-07-13 11:45:22.240623 | divergence/Min nan 2022-07-13 11:45:22.240633 | divergence/Max nan 2022-07-13 11:45:22.240643 | reward_loss/Average nan 2022-07-13 11:45:22.240653 | reward_loss/Std nan 2022-07-13 11:45:22.240663 | reward_loss/Median nan 2022-07-13 11:45:22.240673 | reward_loss/Min nan 2022-07-13 11:45:22.240683 | reward_loss/Max nan 2022-07-13 11:45:22.240693 | image_loss/Average nan 2022-07-13 11:45:22.240702 | image_loss/Std nan 2022-07-13 11:45:22.240712 | image_loss/Median nan 2022-07-13 11:45:22.240722 | image_loss/Min nan 2022-07-13 11:45:22.240732 | image_loss/Max nan 2022-07-13 11:45:22.240742 | pcont_loss/Average nan 2022-07-13 11:45:22.240752 | pcont_loss/Std nan 2022-07-13 11:45:22.240762 | pcont_loss/Median nan 2022-07-13 11:45:22.240772 | pcont_loss/Min nan 2022-07-13 11:45:22.240782 | pcont_loss/Max nan 2022-07-13 11:45:22.240792 | ----------------------------- ----------- 2022-07-13 11:45:22.240882 | dreamer_pong_12 itr #1999 Optimizing over 1000 iterations. 0% [##############################] 100% | ETA: 00:00:00 Total time elapsed: 00:00:04 2022-07-13 11:45:26.303541 | dreamer_pong_12 itr #2999 saving snapshot... 
2022-07-13 11:45:26.333394 | dreamer_pong_12 itr #2999 saved 2022-07-13 11:45:26.344508 | ----------------------------- ------------ 2022-07-13 11:45:26.344557 | Diagnostics/NewCompletedTrajs 2 2022-07-13 11:45:26.344587 | Diagnostics/StepsInTrajWindow 3000 2022-07-13 11:45:26.344640 | Diagnostics/Iteration 2999 2022-07-13 11:45:26.344670 | Diagnostics/CumTime (s) 11.7032 2022-07-13 11:45:26.344712 | Diagnostics/CumSteps 3000 2022-07-13 11:45:26.344737 | Diagnostics/CumCompletedTrajs 6 2022-07-13 11:45:26.344776 | Diagnostics/CumUpdates 0 2022-07-13 11:45:26.344802 | Diagnostics/StepsPerSecond 243.718 2022-07-13 11:45:26.344845 | Diagnostics/UpdatesPerSecond 0 2022-07-13 11:45:26.344864 | Diagnostics/ReplayRatio 0 2022-07-13 11:45:26.344910 | Diagnostics/CumReplayRatio 0 2022-07-13 11:45:26.344924 | Length/Average 500 2022-07-13 11:45:26.344947 | Length/Std 0 2022-07-13 11:45:26.344990 | Length/Median 500 2022-07-13 11:45:26.345010 | Length/Min 500 2022-07-13 11:45:26.345055 | Length/Max 500 2022-07-13 11:45:26.345069 | Return/Average -5.66667 2022-07-13 11:45:26.345093 | Return/Std 0.471405 2022-07-13 11:45:26.345134 | Return/Median -6 2022-07-13 11:45:26.345163 | Return/Min -6 2022-07-13 11:45:26.345205 | Return/Max -5 2022-07-13 11:45:26.345219 | NonzeroRewards/Average 5.66667 2022-07-13 11:45:26.345243 | NonzeroRewards/Std 0.471405 2022-07-13 11:45:26.345282 | NonzeroRewards/Median 6 2022-07-13 11:45:26.345316 | NonzeroRewards/Min 5 2022-07-13 11:45:26.345364 | NonzeroRewards/Max 6 2022-07-13 11:45:26.345383 | DiscountedReturn/Average -0.548505 2022-07-13 11:45:26.345401 | DiscountedReturn/Std 0.0969068 2022-07-13 11:45:26.345417 | DiscountedReturn/Median -0.575386 2022-07-13 11:45:26.345432 | DiscountedReturn/Min -0.632832 2022-07-13 11:45:26.345449 | DiscountedReturn/Max -0.340991 2022-07-13 11:45:26.345464 | GameScore/Average -5.66667 2022-07-13 11:45:26.345480 | GameScore/Std 0.471405 2022-07-13 11:45:26.345495 | GameScore/Median -6 2022-07-13 
11:45:26.345509 | GameScore/Min -6 2022-07-13 11:45:26.345524 | GameScore/Max -5 2022-07-13 11:45:26.345539 | loss/Average nan 2022-07-13 11:45:26.345553 | loss/Std nan 2022-07-13 11:45:26.345568 | loss/Median nan 2022-07-13 11:45:26.345584 | loss/Min nan 2022-07-13 11:45:26.345600 | loss/Max nan 2022-07-13 11:45:26.345616 | grad_norm_model/Average nan 2022-07-13 11:45:26.345630 | grad_norm_model/Std nan 2022-07-13 11:45:26.345644 | grad_norm_model/Median nan 2022-07-13 11:45:26.345658 | grad_norm_model/Min nan 2022-07-13 11:45:26.345673 | grad_norm_model/Max nan 2022-07-13 11:45:26.345689 | grad_norm_actor/Average nan 2022-07-13 11:45:26.345707 | grad_norm_actor/Std nan 2022-07-13 11:45:26.345723 | grad_norm_actor/Median nan 2022-07-13 11:45:26.345739 | grad_norm_actor/Min nan 2022-07-13 11:45:26.345755 | grad_norm_actor/Max nan 2022-07-13 11:45:26.345770 | grad_norm_value/Average nan 2022-07-13 11:45:26.345787 | grad_norm_value/Std nan 2022-07-13 11:45:26.345805 | grad_norm_value/Median nan 2022-07-13 11:45:26.345824 | grad_norm_value/Min nan 2022-07-13 11:45:26.345841 | grad_norm_value/Max nan 2022-07-13 11:45:26.345858 | model_loss/Average nan 2022-07-13 11:45:26.345874 | model_loss/Std nan 2022-07-13 11:45:26.345891 | model_loss/Median nan 2022-07-13 11:45:26.345908 | model_loss/Min nan 2022-07-13 11:45:26.345927 | model_loss/Max nan 2022-07-13 11:45:26.345945 | actor_loss/Average nan 2022-07-13 11:45:26.345964 | actor_loss/Std nan 2022-07-13 11:45:26.345982 | actor_loss/Median nan 2022-07-13 11:45:26.346001 | actor_loss/Min nan 2022-07-13 11:45:26.346020 | actor_loss/Max nan 2022-07-13 11:45:26.346039 | value_loss/Average nan 2022-07-13 11:45:26.346057 | value_loss/Std nan 2022-07-13 11:45:26.346075 | value_loss/Median nan 2022-07-13 11:45:26.346093 | value_loss/Min nan 2022-07-13 11:45:26.346111 | value_loss/Max nan 2022-07-13 11:45:26.346129 | prior_entropy/Average nan 2022-07-13 11:45:26.346146 | prior_entropy/Std nan 2022-07-13 11:45:26.346164 | 
prior_entropy/Median nan 2022-07-13 11:45:26.346181 | prior_entropy/Min nan 2022-07-13 11:45:26.346198 | prior_entropy/Max nan 2022-07-13 11:45:26.346216 | post_entropy/Average nan 2022-07-13 11:45:26.346234 | post_entropy/Std nan 2022-07-13 11:45:26.346250 | post_entropy/Median nan 2022-07-13 11:45:26.346266 | post_entropy/Min nan 2022-07-13 11:45:26.346284 | post_entropy/Max nan 2022-07-13 11:45:26.346303 | divergence/Average nan 2022-07-13 11:45:26.346321 | divergence/Std nan 2022-07-13 11:45:26.346339 | divergence/Median nan 2022-07-13 11:45:26.346364 | divergence/Min nan 2022-07-13 11:45:26.346383 | divergence/Max nan 2022-07-13 11:45:26.346402 | reward_loss/Average nan 2022-07-13 11:45:26.346419 | reward_loss/Std nan 2022-07-13 11:45:26.346437 | reward_loss/Median nan 2022-07-13 11:45:26.346457 | reward_loss/Min nan 2022-07-13 11:45:26.346476 | reward_loss/Max nan 2022-07-13 11:45:26.346493 | image_loss/Average nan 2022-07-13 11:45:26.346510 | image_loss/Std nan 2022-07-13 11:45:26.346527 | image_loss/Median nan 2022-07-13 11:45:26.346545 | image_loss/Min nan 2022-07-13 11:45:26.346563 | image_loss/Max nan 2022-07-13 11:45:26.346582 | pcont_loss/Average nan 2022-07-13 11:45:26.346600 | pcont_loss/Std nan 2022-07-13 11:45:26.346618 | pcont_loss/Median nan 2022-07-13 11:45:26.346637 | pcont_loss/Min nan 2022-07-13 11:45:26.346657 | pcont_loss/Max nan 2022-07-13 11:45:26.346676 | ----------------------------- ------------ 2022-07-13 11:45:26.346844 | dreamer_pong_12 itr #2999 Optimizing over 1000 iterations. 0% [##############################] 100% | ETA: 00:00:00 Total time elapsed: 00:00:04 2022-07-13 11:45:30.617001 | dreamer_pong_12 itr #3999 saving snapshot... 
2022-07-13 11:45:30.642814 | dreamer_pong_12 itr #3999 saved 2022-07-13 11:45:30.651841 | ----------------------------- ----------- 2022-07-13 11:45:30.651886 | Diagnostics/NewCompletedTrajs 2 2022-07-13 11:45:30.651942 | Diagnostics/StepsInTrajWindow 4000 2022-07-13 11:45:30.651966 | Diagnostics/Iteration 3999 2022-07-13 11:45:30.652023 | Diagnostics/CumTime (s) 16.0126 2022-07-13 11:45:30.652075 | Diagnostics/CumSteps 4000 2022-07-13 11:45:30.652097 | Diagnostics/CumCompletedTrajs 8 2022-07-13 11:45:30.652140 | Diagnostics/CumUpdates 0 2022-07-13 11:45:30.652173 | Diagnostics/StepsPerSecond 232.051 2022-07-13 11:45:30.652220 | Diagnostics/UpdatesPerSecond 0 2022-07-13 11:45:30.652243 | Diagnostics/ReplayRatio 0 2022-07-13 11:45:30.652290 | Diagnostics/CumReplayRatio 0 2022-07-13 11:45:30.652312 | Length/Average 500 2022-07-13 11:45:30.652366 | Length/Std 0 2022-07-13 11:45:30.652424 | Length/Median 500 2022-07-13 11:45:30.652471 | Length/Min 500 2022-07-13 11:45:30.652497 | Length/Max 500 2022-07-13 11:45:30.652543 | Return/Average -5.5 2022-07-13 11:45:30.652567 | Return/Std 0.707107 2022-07-13 11:45:30.652596 | Return/Median -6 2022-07-13 11:45:30.652608 | Return/Min -6 2022-07-13 11:45:30.652618 | Return/Max -4 2022-07-13 11:45:30.652628 | NonzeroRewards/Average 5.5 2022-07-13 11:45:30.652638 | NonzeroRewards/Std 0.707107 2022-07-13 11:45:30.652647 | NonzeroRewards/Median 6 2022-07-13 11:45:30.652657 | NonzeroRewards/Min 4 2022-07-13 11:45:30.652667 | NonzeroRewards/Max 6 2022-07-13 11:45:30.652676 | DiscountedReturn/Average -0.496869 2022-07-13 11:45:30.652686 | DiscountedReturn/Std 0.166305 2022-07-13 11:45:30.652696 | DiscountedReturn/Median -0.563765 2022-07-13 11:45:30.652705 | DiscountedReturn/Min -0.632832 2022-07-13 11:45:30.652715 | DiscountedReturn/Max -0.117326 2022-07-13 11:45:30.652725 | GameScore/Average -5.5 2022-07-13 11:45:30.652734 | GameScore/Std 0.707107 2022-07-13 11:45:30.652744 | GameScore/Median -6 2022-07-13 11:45:30.652754 | 
GameScore/Min -6 2022-07-13 11:45:30.652763 | GameScore/Max -4 2022-07-13 11:45:30.652773 | loss/Average nan 2022-07-13 11:45:30.652783 | loss/Std nan 2022-07-13 11:45:30.652796 | loss/Median nan 2022-07-13 11:45:30.652807 | loss/Min nan 2022-07-13 11:45:30.652816 | loss/Max nan 2022-07-13 11:45:30.652826 | grad_norm_model/Average nan 2022-07-13 11:45:30.652836 | grad_norm_model/Std nan 2022-07-13 11:45:30.652846 | grad_norm_model/Median nan 2022-07-13 11:45:30.652855 | grad_norm_model/Min nan 2022-07-13 11:45:30.652865 | grad_norm_model/Max nan 2022-07-13 11:45:30.652874 | grad_norm_actor/Average nan 2022-07-13 11:45:30.652884 | grad_norm_actor/Std nan 2022-07-13 11:45:30.652894 | grad_norm_actor/Median nan 2022-07-13 11:45:30.652903 | grad_norm_actor/Min nan 2022-07-13 11:45:30.652913 | grad_norm_actor/Max nan 2022-07-13 11:45:30.652923 | grad_norm_value/Average nan 2022-07-13 11:45:30.652932 | grad_norm_value/Std nan 2022-07-13 11:45:30.652942 | grad_norm_value/Median nan 2022-07-13 11:45:30.652951 | grad_norm_value/Min nan 2022-07-13 11:45:30.652961 | grad_norm_value/Max nan 2022-07-13 11:45:30.652970 | model_loss/Average nan 2022-07-13 11:45:30.652980 | model_loss/Std nan 2022-07-13 11:45:30.652990 | model_loss/Median nan 2022-07-13 11:45:30.652999 | model_loss/Min nan 2022-07-13 11:45:30.653009 | model_loss/Max nan 2022-07-13 11:45:30.653018 | actor_loss/Average nan 2022-07-13 11:45:30.653028 | actor_loss/Std nan 2022-07-13 11:45:30.653038 | actor_loss/Median nan 2022-07-13 11:45:30.653047 | actor_loss/Min nan 2022-07-13 11:45:30.653057 | actor_loss/Max nan 2022-07-13 11:45:30.653067 | value_loss/Average nan 2022-07-13 11:45:30.653076 | value_loss/Std nan 2022-07-13 11:45:30.653086 | value_loss/Median nan 2022-07-13 11:45:30.653096 | value_loss/Min nan 2022-07-13 11:45:30.653105 | value_loss/Max nan 2022-07-13 11:45:30.653134 | prior_entropy/Average nan 2022-07-13 11:45:30.653149 | prior_entropy/Std nan 2022-07-13 11:45:30.653161 | prior_entropy/Median nan 
2022-07-13 11:45:30.653187 | prior_entropy/Min nan 2022-07-13 11:45:30.653197 | prior_entropy/Max nan 2022-07-13 11:45:30.653207 | post_entropy/Average nan 2022-07-13 11:45:30.653216 | post_entropy/Std nan 2022-07-13 11:45:30.653226 | post_entropy/Median nan 2022-07-13 11:45:30.653236 | post_entropy/Min nan 2022-07-13 11:45:30.653246 | post_entropy/Max nan 2022-07-13 11:45:30.653255 | divergence/Average nan 2022-07-13 11:45:30.653265 | divergence/Std nan 2022-07-13 11:45:30.653275 | divergence/Median nan 2022-07-13 11:45:30.653284 | divergence/Min nan 2022-07-13 11:45:30.653294 | divergence/Max nan 2022-07-13 11:45:30.653304 | reward_loss/Average nan 2022-07-13 11:45:30.653313 | reward_loss/Std nan 2022-07-13 11:45:30.653323 | reward_loss/Median nan 2022-07-13 11:45:30.653333 | reward_loss/Min nan 2022-07-13 11:45:30.653343 | reward_loss/Max nan 2022-07-13 11:45:30.653369 | image_loss/Average nan 2022-07-13 11:45:30.653408 | image_loss/Std nan 2022-07-13 11:45:30.653425 | image_loss/Median nan 2022-07-13 11:45:30.653436 | image_loss/Min nan 2022-07-13 11:45:30.653462 | image_loss/Max nan 2022-07-13 11:45:30.653471 | pcont_loss/Average nan 2022-07-13 11:45:30.653481 | pcont_loss/Std nan 2022-07-13 11:45:30.653491 | pcont_loss/Median nan 2022-07-13 11:45:30.653501 | pcont_loss/Min nan 2022-07-13 11:45:30.653535 | pcont_loss/Max nan 2022-07-13 11:45:30.653546 | ----------------------------- ----------- 2022-07-13 11:45:30.653701 | dreamer_pong_12 itr #3999 Optimizing over 1000 iterations. 0% [##############################] 100% | ETA: 00:00:00 Total time elapsed: 00:00:04 2022-07-13 11:45:34.724956 | dreamer_pong_12 itr #4999 saving snapshot... 
2022-07-13 11:45:34.750314 | dreamer_pong_12 itr #4999 saved 2022-07-13 11:45:34.760227 | ----------------------------- ----------- 2022-07-13 11:45:34.760293 | Diagnostics/NewCompletedTrajs 2 2022-07-13 11:45:34.760357 | Diagnostics/StepsInTrajWindow 5000 2022-07-13 11:45:34.760392 | Diagnostics/Iteration 4999 2022-07-13 11:45:34.760416 | Diagnostics/CumTime (s) 20.1201 2022-07-13 11:45:34.760459 | Diagnostics/CumSteps 5000 2022-07-13 11:45:34.760484 | Diagnostics/CumCompletedTrajs 10 2022-07-13 11:45:34.760526 | Diagnostics/CumUpdates 0 2022-07-13 11:45:34.760546 | Diagnostics/StepsPerSecond 243.455 2022-07-13 11:45:34.760592 | Diagnostics/UpdatesPerSecond 0 2022-07-13 11:45:34.760606 | Diagnostics/ReplayRatio 0 2022-07-13 11:45:34.760657 | Diagnostics/CumReplayRatio 0 2022-07-13 11:45:34.760673 | Length/Average 500 2022-07-13 11:45:34.760775 | Length/Std 0 2022-07-13 11:45:34.760817 | Length/Median 500 2022-07-13 11:45:34.760865 | Length/Min 500 2022-07-13 11:45:34.760880 | Length/Max 500 2022-07-13 11:45:34.760932 | Return/Average -5.5 2022-07-13 11:45:34.760946 | Return/Std 0.67082 2022-07-13 11:45:34.760997 | Return/Median -6 2022-07-13 11:45:34.761012 | Return/Min -6 2022-07-13 11:45:34.761064 | Return/Max -4 2022-07-13 11:45:34.761078 | NonzeroRewards/Average 5.5 2022-07-13 11:45:34.761119 | NonzeroRewards/Std 0.67082 2022-07-13 11:45:34.761131 | NonzeroRewards/Median 6 2022-07-13 11:45:34.761141 | NonzeroRewards/Min 4 2022-07-13 11:45:34.761160 | NonzeroRewards/Max 6 2022-07-13 11:45:34.761171 | DiscountedReturn/Average -0.490804 2022-07-13 11:45:34.761181 | DiscountedReturn/Std 0.15736 2022-07-13 11:45:34.761191 | DiscountedReturn/Median -0.563765 2022-07-13 11:45:34.761201 | DiscountedReturn/Min -0.632832 2022-07-13 11:45:34.761211 | DiscountedReturn/Max -0.117326 2022-07-13 11:45:34.761220 | GameScore/Average -5.5 2022-07-13 11:45:34.761230 | GameScore/Std 0.67082 2022-07-13 11:45:34.761240 | GameScore/Median -6 2022-07-13 11:45:34.761249 | 
GameScore/Min -6 2022-07-13 11:45:34.761259 | GameScore/Max -4 2022-07-13 11:45:34.761269 | loss/Average nan 2022-07-13 11:45:34.761278 | loss/Std nan 2022-07-13 11:45:34.761288 | loss/Median nan 2022-07-13 11:45:34.761298 | loss/Min nan 2022-07-13 11:45:34.761308 | loss/Max nan 2022-07-13 11:45:34.761317 | grad_norm_model/Average nan 2022-07-13 11:45:34.761327 | grad_norm_model/Std nan 2022-07-13 11:45:34.761336 | grad_norm_model/Median nan 2022-07-13 11:45:34.761346 | grad_norm_model/Min nan 2022-07-13 11:45:34.761356 | grad_norm_model/Max nan 2022-07-13 11:45:34.761365 | grad_norm_actor/Average nan 2022-07-13 11:45:34.761375 | grad_norm_actor/Std nan 2022-07-13 11:45:34.761385 | grad_norm_actor/Median nan 2022-07-13 11:45:34.761394 | grad_norm_actor/Min nan 2022-07-13 11:45:34.761404 | grad_norm_actor/Max nan 2022-07-13 11:45:34.761414 | grad_norm_value/Average nan 2022-07-13 11:45:34.761423 | grad_norm_value/Std nan 2022-07-13 11:45:34.761433 | grad_norm_value/Median nan 2022-07-13 11:45:34.761448 | grad_norm_value/Min nan 2022-07-13 11:45:34.761459 | grad_norm_value/Max nan 2022-07-13 11:45:34.761468 | model_loss/Average nan 2022-07-13 11:45:34.761478 | model_loss/Std nan 2022-07-13 11:45:34.761488 | model_loss/Median nan 2022-07-13 11:45:34.761498 | model_loss/Min nan 2022-07-13 11:45:34.761508 | model_loss/Max nan 2022-07-13 11:45:34.761517 | actor_loss/Average nan 2022-07-13 11:45:34.761527 | actor_loss/Std nan 2022-07-13 11:45:34.761537 | actor_loss/Median nan 2022-07-13 11:45:34.761546 | actor_loss/Min nan 2022-07-13 11:45:34.761556 | actor_loss/Max nan 2022-07-13 11:45:34.761566 | value_loss/Average nan 2022-07-13 11:45:34.761575 | value_loss/Std nan 2022-07-13 11:45:34.761585 | value_loss/Median nan 2022-07-13 11:45:34.761595 | value_loss/Min nan 2022-07-13 11:45:34.761605 | value_loss/Max nan 2022-07-13 11:45:34.761615 | prior_entropy/Average nan 2022-07-13 11:45:34.761624 | prior_entropy/Std nan 2022-07-13 11:45:34.761634 | prior_entropy/Median nan 
2022-07-13 11:45:34.761644 | prior_entropy/Min nan 2022-07-13 11:45:34.761654 | prior_entropy/Max nan 2022-07-13 11:45:34.761663 | post_entropy/Average nan 2022-07-13 11:45:34.761673 | post_entropy/Std nan 2022-07-13 11:45:34.761683 | post_entropy/Median nan 2022-07-13 11:45:34.761692 | post_entropy/Min nan 2022-07-13 11:45:34.761702 | post_entropy/Max nan 2022-07-13 11:45:34.761712 | divergence/Average nan 2022-07-13 11:45:34.761721 | divergence/Std nan 2022-07-13 11:45:34.761731 | divergence/Median nan 2022-07-13 11:45:34.761741 | divergence/Min nan 2022-07-13 11:45:34.761751 | divergence/Max nan 2022-07-13 11:45:34.761760 | reward_loss/Average nan 2022-07-13 11:45:34.761770 | reward_loss/Std nan 2022-07-13 11:45:34.761780 | reward_loss/Median nan 2022-07-13 11:45:34.761789 | reward_loss/Min nan 2022-07-13 11:45:34.761799 | reward_loss/Max nan 2022-07-13 11:45:34.761809 | image_loss/Average nan 2022-07-13 11:45:34.761819 | image_loss/Std nan 2022-07-13 11:45:34.761828 | image_loss/Median nan 2022-07-13 11:45:34.761838 | image_loss/Min nan 2022-07-13 11:45:34.761848 | image_loss/Max nan 2022-07-13 11:45:34.761857 | pcont_loss/Average nan 2022-07-13 11:45:34.761867 | pcont_loss/Std nan 2022-07-13 11:45:34.761877 | pcont_loss/Median nan 2022-07-13 11:45:34.761886 | pcont_loss/Min nan 2022-07-13 11:45:34.761896 | pcont_loss/Max nan 2022-07-13 11:45:34.761906 | ----------------------------- ----------- 2022-07-13 11:45:34.761999 | dreamer_pong_12 itr #4999 Optimizing over 1000 iterations. 
Imagination: 0%| | 0/100 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 92, in <module>
    build_and_train(
  File "main.py", line 65, in build_and_train
    runner.train()
  File "/home/uav-robot/anaconda3/envs/juliusfrost/lib/python3.8/site-packages/rlpyt/runners/minibatch_rl.py", line 259, in train
    opt_info = self.algo.optimize_agent(itr, samples)
  File "/home/uav-robot/MBRL/juliusfrost/dreamer-pytorch/dreamer/algos/dreamer_algo.py", line 147, in optimize_agent
    model_loss, actor_loss, value_loss, loss_info = self.loss(buffed_samples, itr, i)
  File "/home/uav-robot/MBRL/juliusfrost/dreamer-pytorch/dreamer/algos/dreamer_algo.py", line 232, in loss
    pcont_loss = -torch.mean(pcont_pred.log_prob(pcont_target))
  File "/home/uav-robot/anaconda3/envs/juliusfrost/lib/python3.8/site-packages/torch/distributions/independent.py", line 95, in log_prob
    log_prob = self.base_dist.log_prob(value)
  File "/home/uav-robot/anaconda3/envs/juliusfrost/lib/python3.8/site-packages/torch/distributions/bernoulli.py", line 100, in log_prob
    self._validate_sample(value)
  File "/home/uav-robot/anaconda3/envs/juliusfrost/lib/python3.8/site-packages/torch/distributions/distribution.py", line 293, in _validate_sample
    raise ValueError(
ValueError: Expected value argument (Tensor of shape (50, 50, 1)) to be within the support (Boolean()) of the distribution Bernoulli(logits: torch.Size([50, 50, 1])), but found invalid values:
tensor([[[0.9900],
     [0.9900],
     [0.9900],
     ...,
     [0.9900],
     [0.9900],
     [0.9900]],

    [[0.9900],
     [0.9900],
     [0.9900],
     ...,
     [0.9900],
     [0.9900],
     [0.9900]],

    [[0.9900],
     [0.9900],
     [0.9900],
     ...,
     [0.9900],
     [0.9900],
     [0.9900]],

    ...,

    [[0.9900],
     [0.9900],
     [0.9900],
     ...,
     [0.9900],
     [0.9900],
     [0.9900]],

    [[0.9900],
     [0.9900],
     [0.9900],
     ...,
     [0.9900],
     [0.9900],
     [0.9900]],

    [[0.9900],
     [0.9900],
     [0.9900],
     ...,
     [0.9900],
     [0.9900],
     [0.9900]]])

My environment is:

packages in environment at /home/uav-robot/anaconda3/envs/juliusfrost:

Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
absl-py 1.1.0 pypi_0 pypi
ale-py 0.7.5 pypi_0 pypi
atari-py 0.2.6 pypi_0 pypi
attrs 21.4.0 pypi_0 pypi
blas 1.0 mkl
brotlipy 0.7.0 py38h27cfd23_1003
bzip2 1.0.8 h7b6447c_0
ca-certificates 2022.4.26 h06a4308_0
cachetools 5.2.0 pypi_0 pypi
certifi 2022.6.15 py38h06a4308_0
cffi 1.15.0 py38hd667e15_1
charset-normalizer 2.0.4 pyhd3eb1b0_0
cloudpickle 1.6.0 pypi_0 pypi
cryptography 37.0.1 py38h9ce1e76_0
cudatoolkit 10.2.89 hfd86e86_1
cython 0.29.30 pypi_0 pypi
dm-control 1.0.3.post1 pypi_0 pypi
dm-env 1.5 pypi_0 pypi
dm-tree 0.1.7 pypi_0 pypi
fasteners 0.17.3 pypi_0 pypi
ffmpeg 4.3 hf484d3e_0 pytorch
freetype 2.11.0 h70c0345_0
giflib 5.2.1 h7b6447c_0
glfw 2.5.3 pypi_0 pypi
gmp 6.2.1 h295c915_3
gnutls 3.6.15 he1e5248_0
google-auth 2.9.1 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
grpcio 1.47.0 pypi_0 pypi
gym 0.19.0 pypi_0 pypi
gym-notices 0.0.7 pypi_0 pypi
idna 3.3 pyhd3eb1b0_0
imageio 2.19.3 pypi_0 pypi
importlib-metadata 4.12.0 pypi_0 pypi
importlib-resources 5.8.0 pypi_0 pypi
iniconfig 1.1.1 pypi_0 pypi
intel-openmp 2021.4.0 h06a4308_3561
jpeg 9e h7f8727e_0
labmaze 1.0.5 pypi_0 pypi
lame 3.100 h7b6447c_0
lcms2 2.12 h3be6417_0
ld_impl_linux-64 2.38 h1181459_1
libffi 3.3 he6710b0_2
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libiconv 1.16 h7f8727e_2
libidn2 2.3.2 h7f8727e_0
libpng 1.6.37 hbc83047_0
libstdcxx-ng 11.2.0 h1234567_1
libtasn1 4.16.0 h27cfd23_0
libtiff 4.2.0 h2818925_1
libunistring 0.9.10 h27cfd23_0
libwebp 1.2.2 h55f646e_0
libwebp-base 1.2.2 h7f8727e_0
lxml 4.9.1 pypi_0 pypi
lz4-c 1.9.3 h295c915_1
markdown 3.3.7 pypi_0 pypi
mkl 2021.4.0 h06a4308_640
mkl-service 2.4.0 py38h7f8727e_0
mkl_fft 1.3.1 py38hd3c417c_0
mkl_random 1.2.2 py38h51133e4_0
mujoco 2.2.0 pypi_0 pypi
mujoco-py 2.1.2.14 pypi_0 pypi
ncurses 6.3 h5eee18b_3
nettle 3.7.3 hbbd107a_1
numpy 1.23.1 pypi_0 pypi
numpy-base 1.22.3 py38hf524024_0
oauthlib 3.2.0 pypi_0 pypi
opencv-python 4.6.0.66 pypi_0 pypi
openh264 2.1.1 h4ff587b_0
openssl 1.1.1q h7f8727e_0
packaging 21.3 pypi_0 pypi
pillow 9.0.1 py38h22f2fdc_0
pip 22.1.2 py38h06a4308_0
pluggy 1.0.0 pypi_0 pypi
protobuf 3.20.1 pypi_0 pypi
psutil 5.9.1 pypi_0 pypi
py 1.11.0 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pycparser 2.21 pyhd3eb1b0_0
pyopengl 3.1.6 pypi_0 pypi
pyopenssl 22.0.0 pyhd3eb1b0_0
pyparsing 2.4.7 pypi_0 pypi
pyprind 2.11.3 pypi_0 pypi
pysocks 1.7.1 py38h06a4308_0
pytest 7.1.2 pypi_0 pypi
python 3.8.13 h12debd9_0
pytorch 1.12.0 py3.8_cuda10.2_cudnn7.6.5_0 pytorch
pytorch-mutex 1.0 cuda pytorch
readline 8.1.2 h7f8727e_1
requests 2.28.0 py38h06a4308_0
requests-oauthlib 1.3.1 pypi_0 pypi
rlpyt 0.1.2 pypi_0 pypi
rsa 4.8 pypi_0 pypi
scipy 1.8.1 pypi_0 pypi
setuptools 61.2.0 py38h06a4308_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.38.5 hc218d9a_0
tensorboard 2.9.1 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
tomli 2.0.1 pypi_0 pypi
torch 1.12.0 pypi_0 pypi
torchaudio 0.12.0 py38_cu102 pytorch
torchvision 0.13.0 py38_cu102 pytorch
tqdm 4.64.0 pypi_0 pypi
typing-extensions 4.3.0 pypi_0 pypi
typing_extensions 4.1.1 pyh06a4308_0
urllib3 1.26.9 py38h06a4308_0
werkzeug 2.1.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
xz 5.2.5 h7f8727e_1
zipp 3.8.1 pypi_0 pypi
zlib 1.2.12 h7f8727e_2
zstd 1.5.2 ha4553b6_0

kabuwaniu, Jul 13 '22 03:07

I have the same issue. Did anyone find a fix for this?

kennetms, Mar 13 '24 17:03
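
A possible explanation, not confirmed by the maintainers: the traceback shows pcont_target filled with 0.9900 (which looks like a discount factor rather than a 0/1 label), and the installed PyTorch checks that values passed to Bernoulli.log_prob lie in the distribution's Boolean support. The Bernoulli log-likelihood formula itself is well defined for soft targets in [0, 1]. The sketch below uses only the standard library to show that computation for a single logit; the function name and the example values are hypothetical. In PyTorch, the same quantity should be the negative of torch.nn.functional.binary_cross_entropy_with_logits(logits, target, reduction="none"), and constructing the Bernoulli with validate_args=False should skip the support check. Where such a change belongs in this repository is an assumption to verify against the code.

```python
import math


def soft_bernoulli_log_prob(logit: float, target: float) -> float:
    """Bernoulli log-likelihood for a target in [0, 1], given a logit.

    Equals target * log(sigmoid(logit)) + (1 - target) * log(1 - sigmoid(logit)),
    evaluated through the numerically stable binary cross-entropy form.
    """
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must lie in [0, 1]")
    # Stable BCE with logits: max(x, 0) - x * t + log(1 + exp(-|x|))
    bce = max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))
    return -bce


# Hypothetical values: a logit of 2.0 and the 0.99 target seen in the traceback.
print(soft_bernoulli_log_prob(2.0, 0.99))
```

Running the same script against a pinned, older PyTorch may also avoid the error, but that is untested here.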