Antmaze results

Open dljzx opened this issue 4 years ago • 3 comments

dljzx avatar Jan 18 '22 09:01 dljzx

Thanks for your work on CQL. It works well on many environments, but it performs badly on the Antmaze environment. Can you figure out why?

dljzx avatar Jan 18 '22 09:01 dljzx

Use the following hyperparameters for Antmaze:

python -m SimpleSAC.conservative_sac_main \
    --env 'antmaze-medium-diverse-v2' \
    --cql.cql_min_q_weight=5.0 \
    --cql.cql_max_target_backup=True \
    --cql.cql_target_action_gap=0.2 \
    --orthogonal_init=True \
    --cql.cql_lagrange=True \
    --cql.cql_temp=1.0 \
    --cql.policy_lr=1e-4 \
    --cql.qf_lr=3e-4 \
    --cql.cql_clip_diff_min=-200 \
    --reward_scale=10.0 \
    --reward_bias=-5.0 \
    --policy_arch='256-256' \
    --qf_arch='256-256-256' \
    --policy_log_std_multiplier=0.0 \
    --eval_period=50 \
    --eval_n_trajs=100 \
    --n_epochs=1200 \
    --bc_epochs=40 \
    --logging.output_dir './experiment_output'
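
As a minimal sketch of what `--reward_scale=10.0` and `--reward_bias=-5.0` do to the data: this assumes the two flags are applied as an affine transform, `reward * scale + bias`, to every dataset reward. That is an assumption about the training script and is not confirmed in this thread. The helper name `transform_rewards` is hypothetical.

```python
def transform_rewards(rewards, reward_scale=10.0, reward_bias=-5.0):
    """Apply an affine transform to each reward (assumed behavior of the flags)."""
    return [r * reward_scale + reward_bias for r in rewards]


# Antmaze uses a sparse reward: 1.0 when the goal is reached, 0.0 otherwise.
sparse_rewards = [0.0, 0.0, 1.0]
shaped_rewards = transform_rewards(sparse_rewards)
print(shaped_rewards)  # [-5.0, -5.0, 5.0]
```

Under this assumption, every non-goal step carries a negative reward and the goal step a positive one, so the two outcomes are separated by a margin of 10 instead of 1.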

young-geng avatar Jan 18 '22 23:01 young-geng

Thanks for your code update. It worked. By the way, your code uses behavior cloning (BC) for the first 40 epochs, but this trick is not mentioned in the paper. Why is BC so important in the Antmaze environment? What happens if we do not use it?
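
For reference, here is a rough sketch of the warm-up being asked about. It assumes that during the first `bc_epochs` epochs the policy loss swaps the Q-value term for the log-likelihood of the dataset action, and then reverts to the usual SAC-style objective. This is an illustration of that assumption, not the repository's actual code. The function names `use_behavior_cloning` and `policy_loss` are hypothetical.

```python
def use_behavior_cloning(epoch, bc_epochs=40):
    """Return True while the run is still in the assumed BC warm-up phase."""
    return epoch < bc_epochs


def policy_loss(alpha, log_pi, q_new_action, log_prob_data_action, bc):
    """Per-sample policy loss under the assumed switch.

    bc=True:  maximize the likelihood of the dataset action (behavior cloning).
    bc=False: maximize the Q-value of the policy's own action (SAC-style).
    Both branches keep the entropy term alpha * log_pi.
    """
    if bc:
        return alpha * log_pi - log_prob_data_action
    return alpha * log_pi - q_new_action


# Epochs 0..39 would use the BC loss, epoch 40 onward the Q-based loss.
for epoch in (0, 39, 40):
    print(epoch, use_behavior_cloning(epoch))
```

If the flag works this way, passing `--bc_epochs=0` would skip the warm-up entirely, which is one way to test how much it matters on Antmaze.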

dljzx avatar Jan 21 '22 11:01 dljzx