
[Feature] Split-trajectories and represent as nested tensor

Open vmoens opened this issue 10 months ago • 3 comments

TODO:

  • [x] Doc

vmoens · Mar 27 '24 11:03

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2043

Note: Links to docs will display an error until the docs builds have been completed.

:x: 5 New Failures

As of commit fedf22fa481515806cf9e5c49f021d203e95fdea with merge base 1083b35ef9733b2335bd88d587cb282e180267c4:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] · Mar 27 '24 11:03

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}7$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 0.1177s | 58.9363ms | 16.9675 Ops/s | 18.3733 Ops/s | $\textbf{\color{#d91a1a}-7.65\%}$ |
| test_sync | 31.5695ms | 30.6515ms | 32.6249 Ops/s | 29.7085 Ops/s | $\textbf{\color{#35bf28}+9.82\%}$ |
| test_async | 46.6053ms | 29.3146ms | 34.1128 Ops/s | 35.3836 Ops/s | $\color{#d91a1a}-3.59\%$ |
| test_simple | 0.3800s | 0.3790s | 2.6388 Ops/s | 2.6668 Ops/s | $\color{#d91a1a}-1.05\%$ |
| test_transformed | 0.5327s | 0.5278s | 1.8948 Ops/s | 1.8815 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_serial | 1.3411s | 1.2790s | 0.7818 Ops/s | 0.7976 Ops/s | $\color{#d91a1a}-1.97\%$ |
| test_parallel | 1.1584s | 1.0889s | 0.9184 Ops/s | 0.9335 Ops/s | $\color{#d91a1a}-1.62\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 64.6610μs | 22.6804μs | 44.0908 KOps/s | 44.3902 KOps/s | $\color{#d91a1a}-0.67\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 42.6700μs | 13.2889μs | 75.2506 KOps/s | 74.4898 KOps/s | $\color{#35bf28}+1.02\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 38.0410μs | 13.3258μs | 75.0424 KOps/s | 75.6954 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 29.0850μs | 7.8911μs | 126.7251 KOps/s | 128.1920 KOps/s | $\color{#d91a1a}-1.14\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 71.1240μs | 24.2011μs | 41.3205 KOps/s | 41.4742 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 41.1270μs | 14.5619μs | 68.6722 KOps/s | 67.8641 KOps/s | $\color{#35bf28}+1.19\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 36.6890μs | 14.5813μs | 68.5809 KOps/s | 68.1302 KOps/s | $\color{#35bf28}+0.66\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 56.2950μs | 9.1432μs | 109.3713 KOps/s | 110.0159 KOps/s | $\color{#d91a1a}-0.59\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 67.1760μs | 25.3777μs | 39.4047 KOps/s | 39.5498 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 46.3470μs | 15.8971μs | 62.9047 KOps/s | 62.0207 KOps/s | $\color{#35bf28}+1.43\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 42.2480μs | 14.7273μs | 67.9010 KOps/s | 69.1462 KOps/s | $\color{#d91a1a}-1.80\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 58.4060μs | 9.1567μs | 109.2101 KOps/s | 110.1174 KOps/s | $\color{#d91a1a}-0.82\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 59.3020μs | 26.6697μs | 37.4958 KOps/s | 37.5527 KOps/s | $\color{#d91a1a}-0.15\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 55.4640μs | 17.1608μs | 58.2725 KOps/s | 57.4144 KOps/s | $\color{#35bf28}+1.49\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 48.0500μs | 15.7957μs | 63.3083 KOps/s | 63.9247 KOps/s | $\color{#d91a1a}-0.96\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 60.1020μs | 10.3215μs | 96.8853 KOps/s | 97.4295 KOps/s | $\color{#d91a1a}-0.56\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 61.6550μs | 25.6210μs | 39.0304 KOps/s | 39.3234 KOps/s | $\color{#d91a1a}-0.75\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 62.4970μs | 15.9503μs | 62.6946 KOps/s | 61.5153 KOps/s | $\color{#35bf28}+1.92\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 56.5460μs | 17.0852μs | 58.5303 KOps/s | 59.5770 KOps/s | $\color{#d91a1a}-1.76\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 44.2630μs | 10.3670μs | 96.4600 KOps/s | 97.4146 KOps/s | $\color{#d91a1a}-0.98\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 88.5060μs | 26.7586μs | 37.3711 KOps/s | 37.4992 KOps/s | $\color{#d91a1a}-0.34\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 51.0450μs | 17.1904μs | 58.1722 KOps/s | 58.2807 KOps/s | $\color{#d91a1a}-0.19\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 0.1257ms | 18.0186μs | 55.4981 KOps/s | 55.1389 KOps/s | $\color{#35bf28}+0.65\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 38.0410μs | 11.6208μs | 86.0524 KOps/s | 87.2033 KOps/s | $\color{#d91a1a}-1.32\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 60.1720μs | 28.2893μs | 35.3491 KOps/s | 35.9096 KOps/s | $\color{#d91a1a}-1.56\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 80.4410μs | 18.7299μs | 53.3906 KOps/s | 53.5972 KOps/s | $\color{#d91a1a}-0.39\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 53.1300μs | 18.1974μs | 54.9528 KOps/s | 54.8551 KOps/s | $\color{#35bf28}+0.18\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 74.7900μs | 11.5849μs | 86.3192 KOps/s | 86.3038 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 40.3260μs | 29.5916μs | 33.7934 KOps/s | 33.8547 KOps/s | $\color{#d91a1a}-0.18\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 76.9240μs | 19.5963μs | 51.0301 KOps/s | 50.9637 KOps/s | $\color{#35bf28}+0.13\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 50.2430μs | 19.1918μs | 52.1056 KOps/s | 52.2626 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 70.1720μs | 12.5586μs | 79.6268 KOps/s | 79.0255 KOps/s | $\color{#35bf28}+0.76\%$ |
| test_values[generalized_advantage_estimate-True-True] | 10.1794ms | 9.8567ms | 101.4536 Ops/s | 106.9063 Ops/s | $\textbf{\color{#d91a1a}-5.10\%}$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 37.9128ms | 35.2077ms | 28.4029 Ops/s | 28.2846 Ops/s | $\color{#35bf28}+0.42\%$ |
| test_values[td0_return_estimate-False-False] | 0.2802ms | 0.1759ms | 5.6851 KOps/s | 5.9580 KOps/s | $\color{#d91a1a}-4.58\%$ |
| test_values[td1_return_estimate-False-False] | 27.3622ms | 24.4041ms | 40.9768 Ops/s | 42.8545 Ops/s | $\color{#d91a1a}-4.38\%$ |
| test_values[vec_td1_return_estimate-False-False] | 36.8570ms | 35.2215ms | 28.3917 Ops/s | 28.4969 Ops/s | $\color{#d91a1a}-0.37\%$ |
| test_values[td_lambda_return_estimate-True-False] | 35.2837ms | 34.3686ms | 29.0963 Ops/s | 29.6989 Ops/s | $\color{#d91a1a}-2.03\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.3559ms | 35.2005ms | 28.4087 Ops/s | 28.4779 Ops/s | $\color{#d91a1a}-0.24\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 11.3994ms | 8.5052ms | 117.5755 Ops/s | 120.6351 Ops/s | $\color{#d91a1a}-2.54\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.2766ms | 1.9438ms | 514.4468 Ops/s | 548.2940 Ops/s | $\textbf{\color{#d91a1a}-6.17\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5433ms | 0.3575ms | 2.7972 KOps/s | 2.8465 KOps/s | $\color{#d91a1a}-1.73\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 49.1113ms | 46.7246ms | 21.4020 Ops/s | 21.5345 Ops/s | $\color{#d91a1a}-0.62\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.6236ms | 3.0659ms | 326.1632 Ops/s | 329.1358 Ops/s | $\color{#d91a1a}-0.90\%$ |
| test_dqn_speed | 6.7809ms | 1.3468ms | 742.5062 Ops/s | 760.2178 Ops/s | $\color{#d91a1a}-2.33\%$ |
| test_ddpg_speed | 3.7809ms | 2.8642ms | 349.1335 Ops/s | 357.2617 Ops/s | $\color{#d91a1a}-2.28\%$ |
| test_sac_speed | 9.6893ms | 8.7114ms | 114.7921 Ops/s | 115.2295 Ops/s | $\color{#d91a1a}-0.38\%$ |
| test_redq_speed | 16.0194ms | 14.0789ms | 71.0282 Ops/s | 72.7380 Ops/s | $\color{#d91a1a}-2.35\%$ |
| test_redq_deprec_speed | 0.1098s | 15.8387ms | 63.1363 Ops/s | 74.2895 Ops/s | $\textbf{\color{#d91a1a}-15.01\%}$ |
| test_td3_speed | 9.3109ms | 8.5299ms | 117.2347 Ops/s | 118.7561 Ops/s | $\color{#d91a1a}-1.28\%$ |
| test_cql_speed | 37.9490ms | 37.0601ms | 26.9832 Ops/s | 27.2012 Ops/s | $\color{#d91a1a}-0.80\%$ |
| test_a2c_speed | 8.9949ms | 7.5480ms | 132.4858 Ops/s | 134.0009 Ops/s | $\color{#d91a1a}-1.13\%$ |
| test_ppo_speed | 9.0382ms | 7.7944ms | 128.2966 Ops/s | 128.0878 Ops/s | $\color{#35bf28}+0.16\%$ |
| test_reinforce_speed | 7.5464ms | 6.7332ms | 148.5175 Ops/s | 148.8444 Ops/s | $\color{#d91a1a}-0.22\%$ |
| test_iql_speed | 33.9415ms | 32.9847ms | 30.3171 Ops/s | 30.6347 Ops/s | $\color{#d91a1a}-1.04\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.8827ms | 3.6417ms | 274.5975 Ops/s | 280.5338 Ops/s | $\color{#d91a1a}-2.12\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8811ms | 0.5118ms | 1.9540 KOps/s | 2.0330 KOps/s | $\color{#d91a1a}-3.89\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.1158s | 0.5446ms | 1.8361 KOps/s | 2.1393 KOps/s | $\textbf{\color{#d91a1a}-14.17\%}$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.0428ms | 3.6020ms | 277.6263 Ops/s | 284.9831 Ops/s | $\color{#d91a1a}-2.58\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9856ms | 0.4965ms | 2.0141 KOps/s | 2.0469 KOps/s | $\color{#d91a1a}-1.60\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7152ms | 0.4691ms | 2.1317 KOps/s | 2.1311 KOps/s | $\color{#35bf28}+0.03\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.9284ms | 1.7411ms | 574.3502 Ops/s | 585.5708 Ops/s | $\color{#d91a1a}-1.92\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.4610ms | 1.6527ms | 605.0795 Ops/s | 618.3040 Ops/s | $\color{#d91a1a}-2.14\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.1398ms | 3.7605ms | 265.9232 Ops/s | 266.2985 Ops/s | $\color{#d91a1a}-0.14\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1373ms | 0.6414ms | 1.5592 KOps/s | 1.6012 KOps/s | $\color{#d91a1a}-2.62\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8847ms | 0.6115ms | 1.6354 KOps/s | 1.6477 KOps/s | $\color{#d91a1a}-0.74\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.9224ms | 3.6566ms | 273.4809 Ops/s | 273.2645 Ops/s | $\color{#35bf28}+0.08\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0165ms | 0.5034ms | 1.9863 KOps/s | 2.0044 KOps/s | $\color{#d91a1a}-0.90\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7725ms | 0.4840ms | 2.0660 KOps/s | 2.1101 KOps/s | $\color{#d91a1a}-2.09\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.6145ms | 3.7768ms | 264.7765 Ops/s | 281.7038 Ops/s | $\textbf{\color{#d91a1a}-6.01\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5905ms | 0.4976ms | 2.0095 KOps/s | 2.0354 KOps/s | $\color{#d91a1a}-1.27\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.8632ms | 0.4846ms | 2.0636 KOps/s | 2.1277 KOps/s | $\color{#d91a1a}-3.01\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.0916ms | 3.8134ms | 262.2308 Ops/s | 269.1923 Ops/s | $\color{#d91a1a}-2.59\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1724ms | 0.6419ms | 1.5579 KOps/s | 1.5725 KOps/s | $\color{#d91a1a}-0.93\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8348ms | 0.6122ms | 1.6335 KOps/s | 1.5979 KOps/s | $\color{#35bf28}+2.23\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1248s | 8.3273ms | 120.0869 Ops/s | 115.9236 Ops/s | $\color{#35bf28}+3.59\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 15.6027ms | 12.7353ms | 78.5221 Ops/s | 79.6281 Ops/s | $\color{#d91a1a}-1.39\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 4.4455ms | 1.2317ms | 811.9131 Ops/s | 961.6608 Ops/s | $\textbf{\color{#d91a1a}-15.57\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1044s | 5.7754ms | 173.1487 Ops/s | 167.9057 Ops/s | $\color{#35bf28}+3.12\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 15.1134ms | 12.7796ms | 78.2498 Ops/s | 80.4645 Ops/s | $\color{#d91a1a}-2.75\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.7512ms | 1.0750ms | 930.2172 Ops/s | 901.3093 Ops/s | $\color{#35bf28}+3.21\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1130s | 6.1175ms | 163.4648 Ops/s | 167.5368 Ops/s | $\color{#d91a1a}-2.43\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.3955ms | 12.8828ms | 77.6227 Ops/s | 79.1766 Ops/s | $\color{#d91a1a}-1.96\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.7104ms | 1.2235ms | 817.3240 Ops/s | 837.3199 Ops/s | $\color{#d91a1a}-2.39\%$ |

github-actions[bot] · Mar 27 '24 11:03

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}3$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 0.1176s | 0.1173s | 8.5286 Ops/s | 8.6411 Ops/s | $\color{#d91a1a}-1.30\%$ |
| test_sync | 0.1044s | 0.1020s | 9.7994 Ops/s | 9.6976 Ops/s | $\color{#35bf28}+1.05\%$ |
| test_async | 0.1865s | 93.3000ms | 10.7181 Ops/s | 10.1942 Ops/s | $\textbf{\color{#35bf28}+5.14\%}$ |
| test_single_pixels | 0.1286s | 0.1284s | 7.7877 Ops/s | 7.8551 Ops/s | $\color{#d91a1a}-0.86\%$ |
| test_sync_pixels | 85.2218ms | 81.0271ms | 12.3416 Ops/s | 12.2743 Ops/s | $\color{#35bf28}+0.55\%$ |
| test_async_pixels | 0.1592s | 68.3696ms | 14.6264 Ops/s | 14.1283 Ops/s | $\color{#35bf28}+3.53\%$ |
| test_simple | 0.8864s | 0.8225s | 1.2157 Ops/s | 1.2485 Ops/s | $\color{#d91a1a}-2.63\%$ |
| test_transformed | 1.1363s | 1.0760s | 0.9293 Ops/s | 0.9378 Ops/s | $\color{#d91a1a}-0.91\%$ |
| test_serial | 2.5594s | 2.5035s | 0.3994 Ops/s | 0.4053 Ops/s | $\color{#d91a1a}-1.44\%$ |
| test_parallel | 2.4174s | 2.3687s | 0.4222 Ops/s | 0.4213 Ops/s | $\color{#35bf28}+0.21\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 81.4730μs | 34.9127μs | 28.6429 KOps/s | 29.6697 KOps/s | $\color{#d91a1a}-3.46\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 43.5710μs | 19.9456μs | 50.1363 KOps/s | 50.5212 KOps/s | $\color{#d91a1a}-0.76\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 44.0710μs | 19.7299μs | 50.6845 KOps/s | 51.7963 KOps/s | $\color{#d91a1a}-2.15\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 43.6920μs | 11.3212μs | 88.3297 KOps/s | 89.1527 KOps/s | $\color{#d91a1a}-0.92\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 81.4140μs | 36.4597μs | 27.4275 KOps/s | 28.2447 KOps/s | $\color{#d91a1a}-2.89\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 42.3420μs | 22.0446μs | 45.3625 KOps/s | 46.0232 KOps/s | $\color{#d91a1a}-1.44\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 45.5810μs | 21.5452μs | 46.4140 KOps/s | 47.2947 KOps/s | $\color{#d91a1a}-1.86\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 32.3020μs | 13.1193μs | 76.2234 KOps/s | 77.3478 KOps/s | $\color{#d91a1a}-1.45\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 65.2320μs | 38.4936μs | 25.9783 KOps/s | 26.6012 KOps/s | $\color{#d91a1a}-2.34\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 44.5720μs | 23.6613μs | 42.2632 KOps/s | 42.5413 KOps/s | $\color{#d91a1a}-0.65\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 87.7330μs | 21.5618μs | 46.3784 KOps/s | 47.4728 KOps/s | $\color{#d91a1a}-2.31\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 36.2210μs | 13.1605μs | 75.9850 KOps/s | 76.6381 KOps/s | $\color{#d91a1a}-0.85\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 64.2230μs | 39.7497μs | 25.1574 KOps/s | 25.3262 KOps/s | $\color{#d91a1a}-0.67\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 58.1620μs | 25.3218μs | 39.4917 KOps/s | 40.2167 KOps/s | $\color{#d91a1a}-1.80\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 44.9020μs | 23.0887μs | 43.3113 KOps/s | 44.5463 KOps/s | $\color{#d91a1a}-2.77\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 41.0310μs | 15.0042μs | 66.6480 KOps/s | 67.2305 KOps/s | $\color{#d91a1a}-0.87\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.1196ms | 38.3130μs | 26.1008 KOps/s | 27.1732 KOps/s | $\color{#d91a1a}-3.95\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 46.3920μs | 23.8450μs | 41.9376 KOps/s | 42.8070 KOps/s | $\color{#d91a1a}-2.03\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 49.3720μs | 25.6041μs | 39.0562 KOps/s | 40.4646 KOps/s | $\color{#d91a1a}-3.48\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 41.4420μs | 14.9649μs | 66.8228 KOps/s | 67.9087 KOps/s | $\color{#d91a1a}-1.60\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 66.0020μs | 39.8486μs | 25.0950 KOps/s | 25.6664 KOps/s | $\color{#d91a1a}-2.23\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 52.5420μs | 25.3191μs | 39.4958 KOps/s | 39.8862 KOps/s | $\color{#d91a1a}-0.98\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 51.1220μs | 27.4297μs | 36.4568 KOps/s | 37.0942 KOps/s | $\color{#d91a1a}-1.72\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 37.6920μs | 16.7233μs | 59.7967 KOps/s | 60.5128 KOps/s | $\color{#d91a1a}-1.18\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 70.3820μs | 42.2538μs | 23.6665 KOps/s | 24.2283 KOps/s | $\color{#d91a1a}-2.32\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 47.2110μs | 27.1649μs | 36.8122 KOps/s | 37.0314 KOps/s | $\color{#d91a1a}-0.59\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 54.1920μs | 27.4027μs | 36.4928 KOps/s | 37.3710 KOps/s | $\color{#d91a1a}-2.35\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 36.9320μs | 16.6156μs | 60.1844 KOps/s | 60.6175 KOps/s | $\color{#d91a1a}-0.71\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 75.9230μs | 44.2131μs | 22.6177 KOps/s | 22.6856 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 53.8920μs | 28.9919μs | 34.4924 KOps/s | 34.2442 KOps/s | $\color{#35bf28}+0.72\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 52.9430μs | 28.8510μs | 34.6608 KOps/s | 35.2580 KOps/s | $\color{#d91a1a}-1.69\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 38.7810μs | 18.6746μs | 53.5487 KOps/s | 54.9798 KOps/s | $\color{#d91a1a}-2.60\%$ |
| test_values[generalized_advantage_estimate-True-True] | 25.2021ms | 24.8311ms | 40.2720 Ops/s | 39.7264 Ops/s | $\color{#35bf28}+1.37\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 88.3040ms | 2.6594ms | 376.0255 Ops/s | 366.2518 Ops/s | $\color{#35bf28}+2.67\%$ |
| test_values[td0_return_estimate-False-False] | 98.8340μs | 67.4336μs | 14.8294 KOps/s | 14.8048 KOps/s | $\color{#35bf28}+0.17\%$ |
| test_values[td1_return_estimate-False-False] | 56.6738ms | 55.5031ms | 18.0170 Ops/s | 17.7036 Ops/s | $\color{#35bf28}+1.77\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.4084ms | 1.0807ms | 925.3255 Ops/s | 921.6397 Ops/s | $\color{#35bf28}+0.40\%$ |
| test_values[td_lambda_return_estimate-True-False] | 89.3289ms | 87.6255ms | 11.4122 Ops/s | 11.2256 Ops/s | $\color{#35bf28}+1.66\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.4349ms | 1.0764ms | 929.0151 Ops/s | 923.6470 Ops/s | $\color{#35bf28}+0.58\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.0212ms | 24.8030ms | 40.3177 Ops/s | 39.6899 Ops/s | $\color{#35bf28}+1.58\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9848ms | 0.7359ms | 1.3590 KOps/s | 1.3764 KOps/s | $\color{#d91a1a}-1.27\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8133ms | 0.6676ms | 1.4978 KOps/s | 1.4892 KOps/s | $\color{#35bf28}+0.58\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6255ms | 1.4681ms | 681.1580 Ops/s | 678.9041 Ops/s | $\color{#35bf28}+0.33\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8253ms | 0.6812ms | 1.4680 KOps/s | 1.4567 KOps/s | $\color{#35bf28}+0.77\%$ |
| test_dqn_speed | 1.7603ms | 1.4728ms | 678.9696 Ops/s | 685.7335 Ops/s | $\color{#d91a1a}-0.99\%$ |
| test_ddpg_speed | 3.3497ms | 3.0052ms | 332.7536 Ops/s | 336.4349 Ops/s | $\color{#d91a1a}-1.09\%$ |
| test_sac_speed | 9.1082ms | 8.5960ms | 116.3328 Ops/s | 117.6222 Ops/s | $\color{#d91a1a}-1.10\%$ |
| test_redq_speed | 0.1069s | 12.1196ms | 82.5108 Ops/s | 92.0605 Ops/s | $\textbf{\color{#d91a1a}-10.37\%}$ |
| test_redq_deprec_speed | 12.4419ms | 11.7893ms | 84.8226 Ops/s | 84.4612 Ops/s | $\color{#35bf28}+0.43\%$ |
| test_td3_speed | 8.7309ms | 8.5920ms | 116.3880 Ops/s | 117.1166 Ops/s | $\color{#d91a1a}-0.62\%$ |
| test_cql_speed | 27.0314ms | 26.1488ms | 38.2426 Ops/s | 38.4593 Ops/s | $\color{#d91a1a}-0.56\%$ |
| test_a2c_speed | 6.0498ms | 5.8214ms | 171.7797 Ops/s | 172.9682 Ops/s | $\color{#d91a1a}-0.69\%$ |
| test_ppo_speed | 6.4352ms | 6.1271ms | 163.2092 Ops/s | 163.7845 Ops/s | $\color{#d91a1a}-0.35\%$ |
| test_reinforce_speed | 5.5939ms | 4.7160ms | 212.0442 Ops/s | 207.6656 Ops/s | $\color{#35bf28}+2.11\%$ |
| test_iql_speed | 20.5470ms | 19.9865ms | 50.0337 Ops/s | 49.6300 Ops/s | $\color{#35bf28}+0.81\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.8105ms | 4.6420ms | 215.4254 Ops/s | 213.5458 Ops/s | $\color{#35bf28}+0.88\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1099s | 0.6996ms | 1.4294 KOps/s | 1.6726 KOps/s | $\textbf{\color{#d91a1a}-14.54\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7821ms | 0.5864ms | 1.7053 KOps/s | 1.7402 KOps/s | $\color{#d91a1a}-2.00\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.9853ms | 4.6096ms | 216.9374 Ops/s | 217.0073 Ops/s | $\color{#d91a1a}-0.03\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2792ms | 0.5996ms | 1.6677 KOps/s | 1.6956 KOps/s | $\color{#d91a1a}-1.64\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7616ms | 0.5737ms | 1.7431 KOps/s | 1.7679 KOps/s | $\color{#d91a1a}-1.41\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.7313ms | 2.1553ms | 463.9803 Ops/s | 466.5706 Ops/s | $\color{#d91a1a}-0.56\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.2744ms | 2.0619ms | 484.9785 Ops/s | 486.2744 Ops/s | $\color{#d91a1a}-0.27\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.5436ms | 4.8277ms | 207.1364 Ops/s | 209.4074 Ops/s | $\color{#d91a1a}-1.08\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9844ms | 0.7568ms | 1.3214 KOps/s | 1.3367 KOps/s | $\color{#d91a1a}-1.15\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.6345ms | 0.7344ms | 1.3617 KOps/s | 1.3771 KOps/s | $\color{#d91a1a}-1.12\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.8444ms | 4.6798ms | 213.6837 Ops/s | 215.0638 Ops/s | $\color{#d91a1a}-0.64\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7430ms | 0.6108ms | 1.6372 KOps/s | 1.6706 KOps/s | $\color{#d91a1a}-2.00\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7716ms | 0.5902ms | 1.6943 KOps/s | 1.7472 KOps/s | $\color{#d91a1a}-3.03\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.9094ms | 4.6319ms | 215.8931 Ops/s | 218.3186 Ops/s | $\color{#d91a1a}-1.11\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7928ms | 0.6032ms | 1.6579 KOps/s | 1.6991 KOps/s | $\color{#d91a1a}-2.42\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.1460s | 0.8157ms | 1.2260 KOps/s | 1.7468 KOps/s | $\textbf{\color{#d91a1a}-29.81\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.0297ms | 4.8186ms | 207.5310 Ops/s | 209.6702 Ops/s | $\color{#d91a1a}-1.02\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9552ms | 0.7624ms | 1.3116 KOps/s | 1.3381 KOps/s | $\color{#d91a1a}-1.98\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9191ms | 0.7403ms | 1.3509 KOps/s | 1.3751 KOps/s | $\color{#d91a1a}-1.76\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1324s | 7.5654ms | 132.1816 Ops/s | 132.8713 Ops/s | $\color{#d91a1a}-0.52\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.6826ms | 16.1127ms | 62.0629 Ops/s | 63.0198 Ops/s | $\color{#d91a1a}-1.52\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.4540ms | 1.3174ms | 759.0558 Ops/s | 777.1983 Ops/s | $\color{#d91a1a}-2.33\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1296s | 7.5428ms | 132.5770 Ops/s | 133.4412 Ops/s | $\color{#d91a1a}-0.65\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1442s | 18.6245ms | 53.6927 Ops/s | 54.5586 Ops/s | $\color{#d91a1a}-1.59\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.6811ms | 1.4398ms | 694.5408 Ops/s | 705.5199 Ops/s | $\color{#d91a1a}-1.56\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1318s | 7.7436ms | 129.1395 Ops/s | 130.9102 Ops/s | $\color{#d91a1a}-1.35\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.5651ms | 16.3021ms | 61.3418 Ops/s | 62.8656 Ops/s | $\color{#d91a1a}-2.42\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.5999ms | 1.5263ms | 655.1975 Ops/s | 672.5030 Ops/s | $\color{#d91a1a}-2.57\%$ |

github-actions[bot] · Mar 27 '24 11:03

cc @jbschlosser Interestingly, padding a nested tensor seems to be faster than padding a bunch of non-contiguous tensors!

vmoens · Jun 28 '24 08:06
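
For readers who want to try the comparison mentioned in the last comment, here is a minimal sketch of the two padding routes. It is not the code from this PR: the trajectory lengths, feature size, and variable names are made up for illustration, and the non-contiguous inputs are simulated by slicing every other column of a wider buffer. It uses only `torch.nested.nested_tensor`, `torch.nested.to_padded_tensor`, and `torch.nn.utils.rnn.pad_sequence`.

```python
import timeit

import torch
from torch.nn.utils.rnn import pad_sequence

torch.manual_seed(0)

# Hypothetical trajectory lengths; each trajectory has shape (length, feature_dim).
lengths = [3, 7, 5, 2]
feature_dim = 4

# Take every other column of a wider buffer so each trajectory is a
# non-contiguous view rather than a freshly allocated tensor.
trajs = [torch.randn(n, 2 * feature_dim)[:, ::2] for n in lengths]

# Route 1: pad the list of non-contiguous tensors directly.
padded_list = pad_sequence(trajs, batch_first=True, padding_value=0.0)

# Route 2: pack the trajectories into a nested tensor, then pad it.
nested = torch.nested.nested_tensor(trajs)
padded_nested = torch.nested.to_padded_tensor(nested, 0.0)

# Both routes should give the same (num_trajs, max_length, feature_dim) result.
print(padded_list.shape, torch.equal(padded_list, padded_nested))

# Rough timing of the padding step only; no particular outcome is implied.
t_list = timeit.timeit(lambda: pad_sequence(trajs, batch_first=True), number=200)
t_nested = timeit.timeit(lambda: torch.nested.to_padded_tensor(nested, 0.0), number=200)
print(f"pad_sequence: {t_list:.4f}s, to_padded_tensor: {t_nested:.4f}s")
```

The timing above excludes the cost of building the nested tensor, and the result will likely depend on tensor sizes, device, and PyTorch version, so treat it as a starting point for measurement rather than a confirmation of the observation.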