[Algorithm] Update scripts with compile
🔗 Helpful Links
- 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2448
- 📄 Preview Python docs built from this PR
Note: Links to docs will display an error until the docs builds have been completed.
❌ 13 New Failures, 9 Unrelated Failures
As of commit 8332774dc5a68970912d8a3df785fafd8669975f with merge base e294c68ca8ac1794b19398b07a1cc42cca586ea1:
NEW FAILURES - The following jobs have failed:
- Examples Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
  RuntimeError: Command docker exec -t 03657b5129be6c9d39d0586e3aa96391f0257ea0016d401a315f97f41f6eeaed /exec failed with exit code 1
- Habitat Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
  RuntimeError: Command docker exec -t 83ff629dc68b540ca661ca22a10f09c3979c5059bca3d67bd7c0f62814560982 /exec failed with exit code 134
- Lint / python-source-and-configs / linux-job (gh)
  torchrl/modules/distributions/utils.py:15:5: F401 'torch._dynamo.is_compiling as is_dynamo_compiling' imported but unused
- Unit-tests on Linux / tests-cpu (3.10) / linux-job (gh)
  test/test_tensordictmodules.py::TestLSTMModule::test_noncontiguous
- Unit-tests on Linux / tests-cpu (3.11) / linux-job (gh)
  test/test_tensordictmodules.py::TestLSTMModule::test_noncontiguous
- Unit-tests on Linux / tests-cpu (3.12) / linux-job (gh)
  test/test_tensordictmodules.py::TestLSTMModule::test_noncontiguous
- Unit-tests on Linux / tests-cpu (3.9) / linux-job (gh)
  test/test_tensordictmodules.py::TestLSTMModule::test_noncontiguous
- Unit-tests on Linux / tests-cpu-oldget (3.12) / linux-job (gh)
  test/test_tensordictmodules.py::TestLSTMModule::test_noncontiguous
- Unit-tests on Linux / tests-gpu (3.11, 12.1) / linux-job (gh)
  test/test_distributions.py::TestTanhNormal::test_tanhnormal[device0-shape1-3-vecs0-0.1--0.1]
- Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job (gh)
  test/test_distributions.py::TestTanhNormal::test_tanhnormal[device0-shape1-3-vecs0-0.1--0.1]
- Unit-tests on Linux / tests-stable-gpu (3.10, 11.8) / linux-job (gh)
  test/test_distributions.py::TestTanhNormal::test_tanhnormal[device0-shape1-3-vecs0-0.1--0.1]
- Unit-tests on Windows / unittests-cpu / windows-job (gh)
  test/test_distributions.py::TestTanhNormal::test_tanhnormal[device0-shape1-3-vecs0-0.1--0.1]
- Wheels / build-wheel-windows (3.10, 3.10.3) (gh)
FLAKY - The following job failed but was likely due to flakiness present on trunk:
- Continuous Benchmark (PR) / GPU Pytest benchmark (gh) (detected as infra flaky with no log or failing log classifier)
BROKEN TRUNK - The following jobs failed but the failures were also present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
- Build Windows Wheels / pytorch/rl (pytorch/rl, .github/scripts/td_script.sh, .github/scripts/version_script.bat, python ... / upload / wheel-py3_9-cpu (gh) (trunk failure)
- Build Windows Wheels / pytorch/rl (pytorch/rl, .github/scripts/td_script.sh, .github/scripts/version_script.bat, python ... / upload / wheel-py3_9-cuda11_8 (gh) (trunk failure)
- Build Windows Wheels / pytorch/rl (pytorch/rl, .github/scripts/td_script.sh, .github/scripts/version_script.bat, python ... / upload / wheel-py3_9-cuda12_1 (gh) (trunk failure)
- Build Windows Wheels / pytorch/rl (pytorch/rl, .github/scripts/td_script.sh, .github/scripts/version_script.bat, python ... / upload / wheel-py3_9-cuda12_4 (gh) (trunk failure)
- Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job (gh) (trunk failure)
  test/test_transforms.py::TestKLRewardTransform::test_kl_lstm
- Wheels / build-wheel-windows (3.11, 3.11) (gh) (trunk failure)
- Wheels / build-wheel-windows (3.12, 3.12) (gh) (trunk failure)
- Wheels / build-wheel-windows (3.9, 3.9) (gh) (trunk failure)
This comment was automatically generated by Dr. CI and updates every 15 minutes.
$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests
Total Benchmarks: 146. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}4$.
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 63.0982ms | 60.8771ms | 16.4265 Ops/s | 16.5862 Ops/s | $\color{#d91a1a}-0.96\%$ |
| test_sync | 39.2815ms | 33.8370ms | 29.5534 Ops/s | 27.9910 Ops/s | $\textbf{\color{#35bf28}+5.58\%}$ |
| test_async | 0.1551s | 32.8933ms | 30.4014 Ops/s | 30.8785 Ops/s | $\color{#d91a1a}-1.55\%$ |
| test_simple | 0.5395s | 0.4411s | 2.2670 Ops/s | 2.3994 Ops/s | $\textbf{\color{#d91a1a}-5.52\%}$ |
| test_transformed | 0.5913s | 0.5878s | 1.7012 Ops/s | 1.7363 Ops/s | $\color{#d91a1a}-2.02\%$ |
| test_serial | 1.2923s | 1.2868s | 0.7771 Ops/s | 0.7715 Ops/s | $\color{#35bf28}+0.72\%$ |
| test_parallel | 1.2650s | 1.1695s | 0.8550 Ops/s | 0.8538 Ops/s | $\color{#35bf28}+0.15\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2930ms | 27.8672μs | 35.8845 KOps/s | 36.7223 KOps/s | $\color{#d91a1a}-2.28\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 50.4540μs | 16.1104μs | 62.0715 KOps/s | 62.0593 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 53.7010μs | 15.7806μs | 63.3688 KOps/s | 63.6925 KOps/s | $\color{#d91a1a}-0.51\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 45.5050μs | 9.2175μs | 108.4892 KOps/s | 108.7100 KOps/s | $\color{#d91a1a}-0.20\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 91.7300μs | 28.9851μs | 34.5005 KOps/s | 34.6111 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 59.4220μs | 17.7436μs | 56.3584 KOps/s | 56.7422 KOps/s | $\color{#d91a1a}-0.68\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 58.6200μs | 17.6105μs | 56.7844 KOps/s | 57.6062 KOps/s | $\color{#d91a1a}-1.43\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 68.7790μs | 10.8896μs | 91.8311 KOps/s | 92.0403 KOps/s | $\color{#d91a1a}-0.23\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 70.1020μs | 31.2498μs | 32.0002 KOps/s | 32.3878 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 51.2760μs | 19.4126μs | 51.5130 KOps/s | 52.0901 KOps/s | $\color{#d91a1a}-1.11\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 56.5660μs | 17.3107μs | 57.7677 KOps/s | 57.7907 KOps/s | $\color{#d91a1a}-0.04\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 59.4810μs | 10.8136μs | 92.4763 KOps/s | 92.4532 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 78.7970μs | 32.4485μs | 30.8180 KOps/s | 31.1154 KOps/s | $\color{#d91a1a}-0.96\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 63.2680μs | 20.9227μs | 47.7950 KOps/s | 48.3141 KOps/s | $\color{#d91a1a}-1.07\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 55.8440μs | 18.9185μs | 52.8584 KOps/s | 53.5534 KOps/s | $\color{#d91a1a}-1.30\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 47.4080μs | 12.3576μs | 80.9216 KOps/s | 81.1539 KOps/s | $\color{#d91a1a}-0.29\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 88.8560μs | 30.9484μs | 32.3119 KOps/s | 32.8107 KOps/s | $\color{#d91a1a}-1.52\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 76.9140μs | 19.2540μs | 51.9372 KOps/s | 51.8460 KOps/s | $\color{#35bf28}+0.18\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 58.4290μs | 20.0504μs | 49.8742 KOps/s | 50.6705 KOps/s | $\color{#d91a1a}-1.57\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 40.5960μs | 12.1495μs | 82.3078 KOps/s | 83.6306 KOps/s | $\color{#d91a1a}-1.58\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 95.0580μs | 32.8210μs | 30.4683 KOps/s | 31.0632 KOps/s | $\color{#d91a1a}-1.92\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 60.8440μs | 20.9259μs | 47.7877 KOps/s | 48.4230 KOps/s | $\color{#d91a1a}-1.31\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 2.9076ms | 21.7155μs | 46.0501 KOps/s | 46.9146 KOps/s | $\color{#d91a1a}-1.84\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 45.0050μs | 13.7031μs | 72.9764 KOps/s | 73.5932 KOps/s | $\color{#d91a1a}-0.84\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 0.1099ms | 34.3271μs | 29.1315 KOps/s | 29.5043 KOps/s | $\color{#d91a1a}-1.26\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 69.3800μs | 22.5878μs | 44.2717 KOps/s | 44.7539 KOps/s | $\color{#d91a1a}-1.08\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 54.6720μs | 21.5835μs | 46.3316 KOps/s | 47.0621 KOps/s | $\color{#d91a1a}-1.55\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 70.9130μs | 13.6117μs | 73.4663 KOps/s | 73.1099 KOps/s | $\color{#35bf28}+0.49\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 82.8750μs | 35.8160μs | 27.9205 KOps/s | 29.0443 KOps/s | $\color{#d91a1a}-3.87\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 69.2090μs | 24.3513μs | 41.0655 KOps/s | 42.6348 KOps/s | $\color{#d91a1a}-3.68\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 57.3870μs | 22.5613μs | 44.3238 KOps/s | 45.1005 KOps/s | $\color{#d91a1a}-1.72\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 48.9020μs | 15.0278μs | 66.5434 KOps/s | 66.6751 KOps/s | $\color{#d91a1a}-0.20\%$ |
| test_values[generalized_advantage_estimate-True-True] | 10.0370ms | 9.7453ms | 102.6138 Ops/s | 105.4208 Ops/s | $\color{#d91a1a}-2.66\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 40.4849ms | 34.1174ms | 29.3106 Ops/s | 27.7370 Ops/s | $\textbf{\color{#35bf28}+5.67\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2673ms | 0.1947ms | 5.1370 KOps/s | 5.3138 KOps/s | $\color{#d91a1a}-3.33\%$ |
| test_values[td1_return_estimate-False-False] | 28.4074ms | 24.5906ms | 40.6660 Ops/s | 41.9834 Ops/s | $\color{#d91a1a}-3.14\%$ |
| test_values[vec_td1_return_estimate-False-False] | 39.8496ms | 34.6192ms | 28.8857 Ops/s | 26.9783 Ops/s | $\textbf{\color{#35bf28}+7.07\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 39.2367ms | 35.2777ms | 28.3465 Ops/s | 29.2212 Ops/s | $\color{#d91a1a}-2.99\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.0782ms | 34.0455ms | 29.3725 Ops/s | 27.5102 Ops/s | $\textbf{\color{#35bf28}+6.77\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.9803ms | 8.2834ms | 120.7241 Ops/s | 121.6934 Ops/s | $\color{#d91a1a}-0.80\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.7578ms | 2.0223ms | 494.4938 Ops/s | 548.5600 Ops/s | $\textbf{\color{#d91a1a}-9.86\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4553ms | 0.3601ms | 2.7774 KOps/s | 2.8014 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 47.1371ms | 45.4656ms | 21.9946 Ops/s | 22.7982 Ops/s | $\color{#d91a1a}-3.52\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.1907ms | 3.1517ms | 317.2885 Ops/s | 324.0887 Ops/s | $\color{#d91a1a}-2.10\%$ |
| test_dqn_speed[False-None] | 6.8903ms | 1.3703ms | 729.7540 Ops/s | 752.4206 Ops/s | $\color{#d91a1a}-3.01\%$ |
| test_dqn_speed[False-backward] | 1.9346ms | 1.8709ms | 534.4934 Ops/s | 544.0747 Ops/s | $\color{#d91a1a}-1.76\%$ |
| test_dqn_speed[True-None] | 0.7568ms | 0.4679ms | 2.1374 KOps/s | 2.1205 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_dqn_speed[True-backward] | 0.9898ms | 0.8923ms | 1.1207 KOps/s | 1.1231 KOps/s | $\color{#d91a1a}-0.21\%$ |
| test_dqn_speed[reduce-overhead-None] | 0.6506ms | 0.4709ms | 2.1238 KOps/s | 2.1179 KOps/s | $\color{#35bf28}+0.28\%$ |
| test_dqn_speed[reduce-overhead-backward] | 0.9636ms | 0.8846ms | 1.1305 KOps/s | 1.1308 KOps/s | $\color{#d91a1a}-0.03\%$ |
| test_ddpg_speed[False-None] | 4.1066ms | 2.8177ms | 354.9032 Ops/s | 356.5173 Ops/s | $\color{#d91a1a}-0.45\%$ |
| test_ddpg_speed[False-backward] | 4.2446ms | 4.0872ms | 244.6686 Ops/s | 246.4540 Ops/s | $\color{#d91a1a}-0.72\%$ |
| test_ddpg_speed[True-None] | 1.3684ms | 1.0277ms | 973.0223 Ops/s | 965.7300 Ops/s | $\color{#35bf28}+0.76\%$ |
| test_ddpg_speed[True-backward] | 2.0227ms | 1.9411ms | 515.1630 Ops/s | 508.8626 Ops/s | $\color{#35bf28}+1.24\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.6172ms | 1.0328ms | 968.2313 Ops/s | 957.3714 Ops/s | $\color{#35bf28}+1.13\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.0684ms | 1.9519ms | 512.3293 Ops/s | 521.2359 Ops/s | $\color{#d91a1a}-1.71\%$ |
| test_sac_speed[False-None] | 9.9844ms | 8.1218ms | 123.1257 Ops/s | 119.8910 Ops/s | $\color{#35bf28}+2.70\%$ |
| test_sac_speed[False-backward] | 12.8058ms | 11.1651ms | 89.5644 Ops/s | 82.8973 Ops/s | $\textbf{\color{#35bf28}+8.04\%}$ |
| test_sac_speed[True-None] | 2.3364ms | 1.9029ms | 525.5270 Ops/s | 521.9624 Ops/s | $\color{#35bf28}+0.68\%$ |
| test_sac_speed[True-backward] | 4.4950ms | 3.6758ms | 272.0496 Ops/s | 274.1971 Ops/s | $\color{#d91a1a}-0.78\%$ |
| test_sac_speed[reduce-overhead-None] | 2.4700ms | 1.9043ms | 525.1151 Ops/s | 517.8199 Ops/s | $\color{#35bf28}+1.41\%$ |
| test_sac_speed[reduce-overhead-backward] | 4.1530ms | 3.7261ms | 268.3736 Ops/s | 267.4705 Ops/s | $\color{#35bf28}+0.34\%$ |
| test_redq_speed[False-None] | 14.6515ms | 13.4854ms | 74.1543 Ops/s | 73.8150 Ops/s | $\color{#35bf28}+0.46\%$ |
| test_redq_speed[False-backward] | 24.6288ms | 23.1483ms | 43.1998 Ops/s | 43.5792 Ops/s | $\color{#d91a1a}-0.87\%$ |
| test_redq_speed[True-None] | 5.9406ms | 4.9390ms | 202.4702 Ops/s | 195.3230 Ops/s | $\color{#35bf28}+3.66\%$ |
| test_redq_speed[True-backward] | 13.0217ms | 12.4503ms | 80.3196 Ops/s | 74.3725 Ops/s | $\textbf{\color{#35bf28}+8.00\%}$ |
| test_redq_speed[reduce-overhead-None] | 6.6449ms | 5.1660ms | 193.5727 Ops/s | 185.9528 Ops/s | $\color{#35bf28}+4.10\%$ |
| test_redq_speed[reduce-overhead-backward] | 14.3250ms | 12.5262ms | 79.8326 Ops/s | 75.7075 Ops/s | $\textbf{\color{#35bf28}+5.45\%}$ |
| test_redq_deprec_speed[False-None] | 14.6471ms | 12.7158ms | 78.6421 Ops/s | 72.6420 Ops/s | $\textbf{\color{#35bf28}+8.26\%}$ |
| test_redq_deprec_speed[False-backward] | 24.1502ms | 19.2491ms | 51.9505 Ops/s | 50.9666 Ops/s | $\color{#35bf28}+1.93\%$ |
| test_redq_deprec_speed[True-None] | 4.3854ms | 3.6193ms | 276.2948 Ops/s | 239.7321 Ops/s | $\textbf{\color{#35bf28}+15.25\%}$ |
| test_redq_deprec_speed[True-backward] | 9.4140ms | 8.7307ms | 114.5386 Ops/s | 111.8638 Ops/s | $\color{#35bf28}+2.39\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 4.3700ms | 3.6629ms | 273.0082 Ops/s | 257.5092 Ops/s | $\textbf{\color{#35bf28}+6.02\%}$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.0375ms | 8.6653ms | 115.4029 Ops/s | 112.9005 Ops/s | $\color{#35bf28}+2.22\%$ |
| test_td3_speed[False-None] | 9.0324ms | 8.0738ms | 123.8575 Ops/s | 122.7668 Ops/s | $\color{#35bf28}+0.89\%$ |
| test_td3_speed[False-backward] | 11.9727ms | 11.0120ms | 90.8098 Ops/s | 92.6818 Ops/s | $\color{#d91a1a}-2.02\%$ |
| test_td3_speed[True-None] | 2.1940ms | 1.9716ms | 507.1993 Ops/s | 486.3631 Ops/s | $\color{#35bf28}+4.28\%$ |
| test_td3_speed[True-backward] | 3.9459ms | 3.6522ms | 273.8080 Ops/s | 243.5850 Ops/s | $\textbf{\color{#35bf28}+12.41\%}$ |
| test_td3_speed[reduce-overhead-None] | 2.1574ms | 1.9418ms | 514.9817 Ops/s | 482.1157 Ops/s | $\textbf{\color{#35bf28}+6.82\%}$ |
| test_td3_speed[reduce-overhead-backward] | 4.5544ms | 3.6512ms | 273.8853 Ops/s | 248.2584 Ops/s | $\textbf{\color{#35bf28}+10.32\%}$ |
| test_cql_speed[False-None] | 38.3751ms | 35.8332ms | 27.9071 Ops/s | 26.8558 Ops/s | $\color{#35bf28}+3.91\%$ |
| test_cql_speed[False-backward] | 53.1019ms | 46.7714ms | 21.3806 Ops/s | 20.6260 Ops/s | $\color{#35bf28}+3.66\%$ |
| test_cql_speed[True-None] | 16.9612ms | 16.1266ms | 62.0092 Ops/s | 60.5129 Ops/s | $\color{#35bf28}+2.47\%$ |
| test_cql_speed[True-backward] | 23.4694ms | 22.6762ms | 44.0990 Ops/s | 42.4800 Ops/s | $\color{#35bf28}+3.81\%$ |
| test_cql_speed[reduce-overhead-None] | 17.7559ms | 16.3969ms | 60.9870 Ops/s | 60.9423 Ops/s | $\color{#35bf28}+0.07\%$ |
| test_cql_speed[reduce-overhead-backward] | 23.6226ms | 22.7138ms | 44.0260 Ops/s | 42.8870 Ops/s | $\color{#35bf28}+2.66\%$ |
| test_a2c_speed[False-None] | 8.8933ms | 7.2794ms | 137.3742 Ops/s | 136.0055 Ops/s | $\color{#35bf28}+1.01\%$ |
| test_a2c_speed[False-backward] | 15.5955ms | 15.0450ms | 66.4674 Ops/s | 65.9701 Ops/s | $\color{#35bf28}+0.75\%$ |
| test_a2c_speed[True-None] | 3.6827ms | 3.3857ms | 295.3630 Ops/s | 288.3131 Ops/s | $\color{#35bf28}+2.45\%$ |
| test_a2c_speed[True-backward] | 11.4100ms | 10.3724ms | 96.4098 Ops/s | 93.5320 Ops/s | $\color{#35bf28}+3.08\%$ |
| test_a2c_speed[reduce-overhead-None] | 4.0046ms | 3.5124ms | 284.7077 Ops/s | 289.6168 Ops/s | $\color{#d91a1a}-1.70\%$ |
| test_a2c_speed[reduce-overhead-backward] | 11.4936ms | 10.6015ms | 94.3259 Ops/s | 96.0289 Ops/s | $\color{#d91a1a}-1.77\%$ |
| test_ppo_speed[False-None] | 8.7908ms | 7.7227ms | 129.4884 Ops/s | 131.5769 Ops/s | $\color{#d91a1a}-1.59\%$ |
| test_ppo_speed[False-backward] | 16.0688ms | 15.3393ms | 65.1922 Ops/s | 65.6001 Ops/s | $\color{#d91a1a}-0.62\%$ |
| test_ppo_speed[True-None] | 4.8604ms | 3.8091ms | 262.5313 Ops/s | 254.1618 Ops/s | $\color{#35bf28}+3.29\%$ |
| test_ppo_speed[True-backward] | 11.5644ms | 10.2323ms | 97.7293 Ops/s | 94.3466 Ops/s | $\color{#35bf28}+3.59\%$ |
| test_ppo_speed[reduce-overhead-None] | 7.7044ms | 3.8625ms | 258.8969 Ops/s | 259.5849 Ops/s | $\color{#d91a1a}-0.27\%$ |
| test_ppo_speed[reduce-overhead-backward] | 10.8065ms | 10.2820ms | 97.2569 Ops/s | 100.1028 Ops/s | $\color{#d91a1a}-2.84\%$ |
| test_reinforce_speed[False-None] | 7.3960ms | 6.6079ms | 151.3330 Ops/s | 147.6948 Ops/s | $\color{#35bf28}+2.46\%$ |
| test_reinforce_speed[False-backward] | 11.9463ms | 10.0493ms | 99.5089 Ops/s | 95.1375 Ops/s | $\color{#35bf28}+4.59\%$ |
| test_reinforce_speed[True-None] | 4.0317ms | 2.6867ms | 372.1973 Ops/s | 362.5389 Ops/s | $\color{#35bf28}+2.66\%$ |
| test_reinforce_speed[True-backward] | 9.4785ms | 9.0461ms | 110.5452 Ops/s | 106.3722 Ops/s | $\color{#35bf28}+3.92\%$ |
| test_reinforce_speed[reduce-overhead-None] | 3.7450ms | 2.6620ms | 375.6594 Ops/s | 346.2387 Ops/s | $\textbf{\color{#35bf28}+8.50\%}$ |
| test_reinforce_speed[reduce-overhead-backward] | 10.2407ms | 8.9624ms | 111.5778 Ops/s | 106.8318 Ops/s | $\color{#35bf28}+4.44\%$ |
| test_iql_speed[False-None] | 34.8184ms | 32.6298ms | 30.6468 Ops/s | 29.6465 Ops/s | $\color{#35bf28}+3.37\%$ |
| test_iql_speed[False-backward] | 54.0979ms | 45.8473ms | 21.8115 Ops/s | 21.5402 Ops/s | $\color{#35bf28}+1.26\%$ |
| test_iql_speed[True-None] | 14.9803ms | 13.8691ms | 72.1027 Ops/s | 70.7302 Ops/s | $\color{#35bf28}+1.94\%$ |
| test_iql_speed[True-backward] | 26.1419ms | 24.9348ms | 40.1046 Ops/s | 38.3257 Ops/s | $\color{#35bf28}+4.64\%$ |
| test_iql_speed[reduce-overhead-None] | 15.2671ms | 13.8889ms | 72.0000 Ops/s | 68.1353 Ops/s | $\textbf{\color{#35bf28}+5.67\%}$ |
| test_iql_speed[reduce-overhead-backward] | 26.5952ms | 25.2849ms | 39.5493 Ops/s | 38.5417 Ops/s | $\color{#35bf28}+2.61\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.7711ms | 5.3106ms | 188.3036 Ops/s | 182.0163 Ops/s | $\color{#35bf28}+3.45\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.5958ms | 0.4839ms | 2.0667 KOps/s | 1.9937 KOps/s | $\color{#35bf28}+3.66\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7044ms | 0.4596ms | 2.1757 KOps/s | 2.1001 KOps/s | $\color{#35bf28}+3.60\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.4186ms | 5.1420ms | 194.4762 Ops/s | 184.1845 Ops/s | $\textbf{\color{#35bf28}+5.59\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8805ms | 0.4827ms | 2.0715 KOps/s | 2.0605 KOps/s | $\color{#35bf28}+0.53\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7658ms | 0.4591ms | 2.1783 KOps/s | 2.1324 KOps/s | $\color{#35bf28}+2.15\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.9642ms | 1.5938ms | 627.4256 Ops/s | 618.6275 Ops/s | $\color{#35bf28}+1.42\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.7039ms | 1.5001ms | 666.6037 Ops/s | 665.1734 Ops/s | $\color{#35bf28}+0.22\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.8573ms | 5.2258ms | 191.3566 Ops/s | 174.8291 Ops/s | $\textbf{\color{#35bf28}+9.45\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.4029ms | 0.6153ms | 1.6251 KOps/s | 1.5778 KOps/s | $\color{#35bf28}+3.00\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9896ms | 0.5855ms | 1.7080 KOps/s | 1.6294 KOps/s | $\color{#35bf28}+4.82\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 8.1226ms | 5.2173ms | 191.6690 Ops/s | 181.1724 Ops/s | $\textbf{\color{#35bf28}+5.79\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6695ms | 0.4864ms | 2.0560 KOps/s | 2.0245 KOps/s | $\color{#35bf28}+1.55\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7075ms | 0.4727ms | 2.1153 KOps/s | 2.0964 KOps/s | $\color{#35bf28}+0.90\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 8.3597ms | 5.3784ms | 185.9284 Ops/s | 180.6132 Ops/s | $\color{#35bf28}+2.94\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1651ms | 0.4882ms | 2.0485 KOps/s | 2.0177 KOps/s | $\color{#35bf28}+1.53\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6571ms | 0.4629ms | 2.1605 KOps/s | 2.0871 KOps/s | $\color{#35bf28}+3.51\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.1595ms | 5.5848ms | 179.0564 Ops/s | 173.6516 Ops/s | $\color{#35bf28}+3.11\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.6672ms | 0.6127ms | 1.6322 KOps/s | 1.5693 KOps/s | $\color{#35bf28}+4.01\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8473ms | 0.5917ms | 1.6900 KOps/s | 1.6704 KOps/s | $\color{#35bf28}+1.17\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 5.7326ms | 4.2490ms | 235.3500 Ops/s | 220.8757 Ops/s | $\textbf{\color{#35bf28}+6.55\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.7024ms | 13.4148ms | 74.5447 Ops/s | 75.7641 Ops/s | $\color{#d91a1a}-1.61\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 4.9355ms | 1.4056ms | 711.4226 Ops/s | 706.2216 Ops/s | $\color{#35bf28}+0.74\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.6052s | 16.4077ms | 60.9469 Ops/s | 224.7393 Ops/s | $\textbf{\color{#d91a1a}-72.88\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 18.4656ms | 13.4145ms | 74.5460 Ops/s | 74.7526 Ops/s | $\color{#d91a1a}-0.28\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.4943ms | 1.5859ms | 630.5389 Ops/s | 692.8807 Ops/s | $\textbf{\color{#d91a1a}-9.00\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.6933ms | 4.3755ms | 228.5456 Ops/s | 211.9799 Ops/s | $\textbf{\color{#35bf28}+7.81\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.6266ms | 13.3728ms | 74.7785 Ops/s | 23.6903 Ops/s | $\textbf{\color{#35bf28}+215.65\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.1477ms | 1.4726ms | 679.0542 Ops/s | 635.0628 Ops/s | $\textbf{\color{#35bf28}+6.93\%}$ |
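
The Change column above appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD, expressed as a percentage of the HEAD value. The rows shown in bold, which match the "Improved: 22" and "Worsened: 4" counts, are all beyond roughly ±5%. The sketch below reproduces that arithmetic for a few rows of the table. The function names and the 5% threshold are assumptions inferred from the table itself, not taken from the benchmark tooling.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to Repo HEAD.

    Both values must be in the same unit (Ops/s or KOps/s), as they are
    within each row of the table.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_single": (16.4265, 16.5862),
    "test_sync": (29.5534, 27.9910),
    "test_simple": (2.2670, 2.3994),
}

for name, (ops_pr, ops_head) in rows.items():
    pct = relative_change(ops_pr, ops_head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Outliers such as the -72.88% and +215.65% `test_rb_populate` rows are worth re-running before drawing conclusions, since a single slow iteration can skew the mean (the first has a Max of 0.6052s against a Mean of 16.4077ms).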