skrl
Modular Reinforcement Learning (RL) library (implemented in PyTorch, JAX, and NVIDIA Warp) with support for Gymnasium/Gym, NVIDIA Isaac Lab, Brax and other environments
Previously, sampling failed if the action space was infinite. While an infinite action space might be bad practice, in some cases it is not easy to get rid of...
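A minimal sketch of the idea being fixed, assuming a NumPy Box-style space; the helper name and the finite cap of 1e6 are illustrative choices, not skrl's actual values:

```python
import numpy as np

def sample_bounded(low, high, size, cap=1e6):
    """Uniform sampling that tolerates infinite bounds.

    Infinite entries in `low`/`high` are replaced by an arbitrary finite cap
    (illustrative value, not taken from skrl) so np.random.uniform stays valid.
    """
    low = np.where(np.isinf(low), -cap, low)
    high = np.where(np.isinf(high), cap, high)
    return np.random.uniform(low=low, high=high, size=size)
```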
Hi @Toni-SM , are there examples of using multiple input observations? For example, one input is an image and another input is a vector, similar to Stable-Baselines3's multiple inputs and dictionary observations here: https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html#multiple-inputs-and-dictionary-observations
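A generic PyTorch sketch of the fusion idea being asked about; this is a plain `nn.Module`, not skrl's model API, and the observation keys (`"image"`, `"vector"`) and layer sizes are assumptions:

```python
import torch
import torch.nn as nn

class MultiInputNet(nn.Module):
    """Fuses an image observation and a vector observation into one output."""

    def __init__(self, vector_dim=10, num_actions=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.mlp = nn.Sequential(nn.Linear(vector_dim, 64), nn.ReLU())
        self.head = nn.LazyLinear(num_actions)  # infers the fused feature size

    def forward(self, obs: dict) -> torch.Tensor:
        image_features = self.cnn(obs["image"])    # (N, 3, H, W) -> (N, F)
        vector_features = self.mlp(obs["vector"])  # (N, vector_dim) -> (N, 64)
        return self.head(torch.cat([image_features, vector_features], dim=-1))
```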
# Add Model-Based Meta-Policy-Optimization (MBMPO) ## Introduction and description Coming soon ## Improvements in this PR Coming soon ## Proof of Work Coming soon Cheers, Johann
### Description I am using the latest [orbit](https://github.com/NVIDIA-Omniverse/orbit/tree/83e14f096ed3b20223cdca3065975bcc7dfa22f1) with skrl 1.1.0, and I am trying to run the example code provided in your docs (like [torch_ant_ppo.py](https://skrl.readthedocs.io/en/latest/_downloads/3faa6f6c7e33a77373e38111c8999c22/torch_ant_ppo.py)), but I got No module named...
This pull request addresses a discrepancy between the algorithm in the original TD3 and DDPG papers and the current implementation in the repository. Specifically, the original implementation performs the sampling step outside...
### Discussed in https://github.com/Toni-SM/skrl/discussions/70 Originally posted by **403forbiddennn** April 19, 2023 In the **Isaac Gym wrapper** class, the `render` method is inappropriately overridden by your wrapper and thus can not...
### Description Hi there, I have been using `skrl` with OIGE, but when I try the "Getting Started" code for `dm_control`: ``` # import the environment wrapper and the...
### Description Random actions are generated by taking the low and high values of the first dimension of the action space and then uniformly sampling from [low, high] for each...
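A minimal sketch contrasting per-dimension sampling with sampling from only the first dimension's bounds, assuming a Gymnasium `Box` space; the space and batch size are illustrative:

```python
import numpy as np
from gymnasium.spaces import Box

space = Box(low=np.array([-1.0, 0.0]), high=np.array([1.0, 10.0]))
batch_size = 4

# Per-dimension sampling: each column respects its own [low, high]
actions = np.random.uniform(low=space.low, high=space.high,
                            size=(batch_size,) + space.shape)

# Using only the first dimension's bounds samples every column from [-1, 1],
# which is wrong for the second dimension whose valid range is [0, 10]
wrong_actions = np.random.uniform(low=space.low[0], high=space.high[0],
                                  size=(batch_size,) + space.shape)
```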
### Description The mean rewards are computed by appending the mean of all stored cumulative rewards to the self.tracking_data dictionary: `self.tracking_data["Reward / Total reward (mean)"].append(np.mean(track_rewards))`. Then, every time the data...
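A minimal sketch of the tracking pattern described above; the deque length and every name other than the ones quoted in the report are assumptions:

```python
import collections
import numpy as np

tracking_data = collections.defaultdict(list)
track_rewards = collections.deque(maxlen=100)  # assumed window of episode returns

# when an episode finishes, its cumulative reward is stored ...
track_rewards.append(123.4)

# ... and the mean over all stored cumulative rewards is appended for logging
tracking_data["Reward / Total reward (mean)"].append(np.mean(track_rewards))
```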
# Mixed precision **Motivation**: Inspired by RLGames, we implemented automatic mixed precision to boost the performance of PPO. **Sources:** **Speed eval:** - Big neural network (units: [2048, 1024, 1024, 512])...
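A minimal sketch of the standard PyTorch automatic-mixed-precision training step this refers to, assuming a CUDA device; the model, optimizer, and loss are placeholders, not skrl's PPO internals:

```python
import torch

model = torch.nn.Linear(64, 8).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid fp16 underflow

def training_step(states, targets):
    optimizer.zero_grad()
    # the forward pass runs in mixed precision (fp16/fp32 chosen per op)
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(states), targets)
    scaler.scale(loss).backward()   # backward pass on the scaled loss
    scaler.step(optimizer)          # unscale gradients, then optimizer step
    scaler.update()                 # adjust the scale factor for the next step
    return loss.item()
```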