sac issues

TD3 vs SAC

Hi, first, thanks for sharing the repo. I am really confused by the performance comparison between SAC and TD3. In TD3's results, TD3 beats SAC in every environment evaluated with...

HYDesmondLiu

DIAYN result reproduction & additional charts

Hi! I am currently trying to verify my DIAYN implementation and I was wondering if there are any additional results available that are not provided within the [original paper](https://arxiv.org/abs/1802.06070) or...

bozic-djordje

About markovian environments

4

Hi, thanks for the thorough implementation and for making this code available; it really helps to understand the internal mechanisms of the SAC algorithm. I have a question regarding the code...

shanlior

maximization bias

1

Hello, I'm not sure whether this is an issue or not but I've been looking at your implementation for half an hour, and I think there might be a maximization...

mikelty

a mathematical problem ..

1

I derived Equation 12, but the result is not the same as Equation 13 in your paper. In my derivation, I didn't get the first term in Equation 13, I...

tyfeng1997

what is "sandbox"

2

Traceback (most recent call last):
  File "/home/xtq/sac/examples/mujoco_all_sac.py", line 15, in <module>
    from sac.algos import SAC
  File "/home/xtq/sac/sac/algos/__init__.py", line 2, in <module>
    from .diayn import DIAYN
  File "/home/xtq/sac/sac/algos/diayn.py", line 10, in <module>
    from sac.policies.hierarchical_policy...

zienn

Docker Image won't create due to conda's environment.yml inconsistencies

When executing `docker-compose up` from my MacBook Pro, the Dockerfile build fails at `step 26/35` because of unsatisfied libraries for the `numpy=1.13.0` dependency (blas, mkl, etc.), so I made some modifications to make...

rock-it-with-asher

SAC and cross-maze ant

2

Hi, can SAC run with the cross-maze variation for ant? With the default parameters, the command "python ./examples/mujoco_all_sac.py --env=ant --domain=ant --task=cross-maze --policy=gmm --log_dir=data/ant_cross-experiment" does not throw any errors but "Launches...

acohen13

could there be a problem with the initialization of GaussianPolicy ?

GaussianPolicy inherits from NNPolicy, and at the end of the GaussianPolicy constructor (`__init__`) there is a call to the parent of NNPolicy... Shouldn't it be: `super(GaussianPolicy,self).__init__(env_spec)`? I observed its repeating...

guyk1971
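To make the question above concrete, here is a minimal sketch of how the first argument to `super()` changes which constructor runs. The classes `Base`, `SkipsParent`, and `CallsParent` are hypothetical stand-ins, and this `NNPolicy` is a toy, not the repository's class. The sketch only illustrates Python's method resolution order; it does not say whether the repository's call is intentional.

```python
# Hypothetical stand-ins for illustration only; not the repository's classes.
class Base:
    def __init__(self, env_spec):
        self.trace = ["Base"]


class NNPolicy(Base):
    def __init__(self, env_spec):
        super(NNPolicy, self).__init__(env_spec)
        self.trace.append("NNPolicy")


class SkipsParent(NNPolicy):
    def __init__(self, env_spec):
        # Lookup starts *after* NNPolicy in the MRO, so NNPolicy.__init__ is skipped.
        super(NNPolicy, self).__init__(env_spec)
        self.trace.append("SkipsParent")


class CallsParent(NNPolicy):
    def __init__(self, env_spec):
        # Lookup starts after CallsParent, so NNPolicy.__init__ runs first.
        super(CallsParent, self).__init__(env_spec)
        self.trace.append("CallsParent")


skipped = SkipsParent(env_spec=None).trace
called = CallsParent(env_spec=None).trace
print(skipped)  # ['Base', 'SkipsParent']
print(called)   # ['Base', 'NNPolicy', 'CallsParent']
```

Skipping a parent constructor this way is sometimes deliberate, for example when the subclass performs the parent's setup itself, so the pattern alone does not prove a bug.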

paper/code conflict: using minimum Q in policy gradient

1

The Soft Actor-Critic paper ([arXiv v2](https://arxiv.org/abs/1801.01290)) says, in the last paragraph on page 5: > We then use the minimum of the Q-functions for the value gradient in Equation 6...

jpreiss
