davidireland3 issues

Results: 8 issues of davidireland3

getting suboptimal values for a binary integer programming problem

I have formulated a budgeted variant of max cut as a MIP based on the answer to [this question](https://or.stackexchange.com/a/7274/7471). I was initially using the Python MIP package, but it quickly...

Issue with critic target in PPO

In the [line used to define the returns](https://github.com/philtabor/Youtube-Code-Repository/blob/1ef76059bf55f7df9ccc09fce0e0bfb7c13e89bd/ReinforcementLearning/PolicyGradient/PPO/torch/ppo_torch.py#L186), we use the GAE + values as the target for the critic to learn. Is this correct? My intuition says no --...
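For context on what "GAE + values" yields as a critic target, here is a minimal NumPy sketch. It is not the linked repository's code: the function name `gae_advantages`, the toy rewards and values, and the assumption of a single trajectory with no intermediate terminal states are all illustrative. It shows that adding the value estimates back onto the GAE advantages gives the λ-return, which reduces to the one-step TD target at λ = 0 and to the discounted Monte Carlo return at λ = 1 (when the bootstrap value is zero).

```python
import numpy as np


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """GAE for one trajectory (hypothetical helper, no mid-trajectory terminals)."""
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    # V(s_{t+1}) for each step; the final step bootstraps from last_value.
    next_values = np.append(values[1:], last_value)
    # One-step TD errors: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t).
    deltas = rewards + gamma * next_values - values
    advantages = np.zeros_like(deltas)
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}.
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages


# Toy data, chosen only for illustration.
rewards = np.array([1.0, 0.0, 2.0])
values = np.array([0.5, 0.4, 0.3])

# lam = 1: advantages + values is the discounted Monte Carlo return.
mc_targets = gae_advantages(rewards, values, 0.0, gamma=0.9, lam=1.0) + values

# lam = 0: advantages + values is the one-step TD target r_t + gamma * V(s_{t+1}).
td_targets = gae_advantages(rewards, values, 0.0, gamma=0.9, lam=0.0) + values
```

Whether that target suits a particular PPO implementation is the question the issue raises; the sketch only makes the quantity concrete.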

Using Quickstart D4PG tutorial with a gpu

Is there any documentation anywhere? I'm trying to figure out how to run the Quickstart D4PG tutorial with a GPU, but I cannot find any mention anywhere of how to...

BBF update horizon in replay buffer looks to be miscalculated?

In this line [here](https://github.com/google-research/google-research/blob/513d75625c30a9080d8afdcc1ba1bde46c573d62/bigger_better_faster/bbf/replay_memory/subsequence_replay_buffer.py#L638), it looks as though they subtract one from the update horizon and use this to calculate the next state indices, but this is wrong. E.g. when...

Will support be added for new MuJoCo bindings?

InfoNCE errors

how can I find out information about which body part the observation belongs to?

dm-control mujoco tasks use A LOT of cpu's