davidireland3

Results 8 issues of davidireland3

I have formulated a budgeted variant of max cut as a MIP based on the answer to [this question](https://or.stackexchange.com/a/7274/7471). I was initially using the Python MIP package, but it quickly...

In the [line used to define the returns](https://github.com/philtabor/Youtube-Code-Repository/blob/1ef76059bf55f7df9ccc09fce0e0bfb7c13e89bd/ReinforcementLearning/PolicyGradient/PPO/torch/ppo_torch.py#L186), we use the GAE + values as the target for the critic to learn. Is this correct? My intuition says no --...
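
To make the quantity in question concrete, here is a minimal sketch of GAE and the `advantage + value` critic target. It is not a reproduction of the linked line: the function name `gae_advantages`, the toy numbers, and the convention that `values` carries one extra bootstrap entry are all assumptions made for illustration. It relies on the standard identity that the GAE advantage plus V(s_t) equals the λ-return, which reduces to the discounted Monte Carlo return when λ = 1.

```python
import numpy as np

def gae_advantages(rewards, values, dones, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    Assumption: `values` has length T + 1, i.e. it includes a bootstrap
    value for the state that follows the last transition.
    """
    T = len(rewards)
    adv = np.zeros(T)
    running = 0.0
    for t in reversed(range(T)):
        nonterminal = 1.0 - dones[t]
        # One-step TD error for transition t.
        delta = rewards[t] + gamma * values[t + 1] * nonterminal - values[t]
        # Exponentially weighted sum of the TD errors that follow.
        running = delta + gamma * lam * nonterminal * running
        adv[t] = running
    return adv

# Toy trajectory (hypothetical numbers) that terminates at the last step.
rewards = np.array([1.0, 1.0, 1.0])
values = np.array([0.5, 0.4, 0.3, 0.0])
dones = np.array([0.0, 0.0, 1.0])

adv = gae_advantages(rewards, values, dones, gamma=0.99, lam=1.0)
# Critic target formed as advantage + value estimate.
critic_targets = adv + values[:-1]
print(critic_targets)
```

With `lam=1.0` the targets in this toy case coincide with the discounted returns of the trajectory; for smaller `lam` they move toward the one-step TD target.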

Is there any documentation? I'm trying to figure out how to run the Quickstart D4PG tutorial with a GPU, but I cannot find any mention of how to...

In this line [here](https://github.com/google-research/google-research/blob/513d75625c30a9080d8afdcc1ba1bde46c573d62/bigger_better_faster/bbf/replay_memory/subsequence_replay_buffer.py#L638), it looks as though they subtract one from the update horizon and use this to calculate the next state indices, but this is wrong. E.g. when...

Will compatibility be added so that D4RL can be used with the new MuJoCo (i.e. by just running `pip install mujoco`)?

I believe there are some issues with the InfoNCE loss. After stepping through the code, the denominator is calculated only for negatives in the batch (it should be similar to...
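
For reference, a minimal sketch of one commonly used in-batch form of InfoNCE, in which the positive pair also appears in the denominator. This is not the code from the repository under discussion, and whether it matches the form the issue had in mind is an assumption; the function name `info_nce`, the cosine-similarity scoring, and the `temperature` default are illustrative choices.

```python
import numpy as np

def info_nce(queries, keys, temperature=0.1):
    """In-batch InfoNCE: row i of `keys` is the positive for row i of
    `queries`; every other row of `keys` acts as a negative."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature
    # Shift each row for numerical stability (the loss is unchanged).
    logits = logits - logits.max(axis=1, keepdims=True)
    # The denominator sums over ALL keys, positive included.
    log_denominator = np.log(np.exp(logits).sum(axis=1))
    log_numerator = np.diag(logits)
    return float(np.mean(log_denominator - log_numerator))

# Perfectly aligned, mutually orthogonal pairs give a loss close to zero.
aligned_loss = info_nce(np.eye(4), np.eye(4))

rng = np.random.default_rng(0)
random_loss = info_nce(rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))
print(aligned_loss, random_loss)
```

Because the positive term is part of the sum in the denominator, each per-sample loss is a softmax cross-entropy and cannot go below zero. A denominator built from negatives only does not have that lower bound.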

Similar to the OpenAI MuJoCo tasks, which have documentation detailing which dimension of the state corresponds to which part of the physical body, it would be good to be...

[Potentially related to this post](https://github.com/google-deepmind/mujoco/issues/203). I was wondering why my code slows down when running a few seeds at a time. When I looked, the CPUs are being throttled by...