OfflineRL
A collection of offline reinforcement learning algorithms.
Since COMBO is derived from CQL, I wonder why the automatic adjustment used in CQL is not discussed in COMBO. Is it not useful in COMBO?
- This is for compatibility with newer versions of PyTorch, where `mode` was changed to a property of the distribution classes. See https://github.com/pytorch/pytorch/pull/76690 for details.
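  For reference, a minimal sketch of the pattern that breaks and a workaround, assuming a `TanhNormal` like the one in `offlinerl/utils/net/tanhpolicy.py` (the class body and the `_mode` name below are illustrative, not the repo's exact code):

  ```python
  import torch
  from torch.distributions import Normal


  class TanhNormal(Normal):
      """Normal distribution whose samples are squashed by tanh (sketch)."""

      def __init__(self, normal_mean, normal_std):
          super().__init__(normal_mean, normal_std)
          # On PyTorch >= 1.12 the old assignment raises
          # "AttributeError: can't set attribute", because `mode` is now
          # a read-only property on Distribution subclasses:
          #     self.mode = torch.tanh(normal_mean)
          self._mode = torch.tanh(normal_mean)

      @property
      def mode(self):
          # Overriding the property keeps `dist.mode` working on both
          # old and new PyTorch versions.
          return self._mode


  dist = TanhNormal(torch.zeros(3), torch.ones(3))
  print(dist.mode)  # tensor([0., 0., 0.])
  ```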
```
Usage of Session.__init__ is deprecated!
Traceback (most recent call last):
  File "examples/train_d4rl.py", line 19, in <module>
    fire.Fire(run_algo)
  File "/root/anaconda3/envs/off/lib/python3.7/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
...
```
When I run the command

```bash
python examples/train_task.py --algo_name=mopo --exp_name=halfcheetah --task HalfCheetah-v3 --task_data_type low --task_train_num 2
```

it shows:

```
  File "examples/train_task.py", line 19, in <module>
    fire.Fire(run_algo)
  File "/home/lksgcc/.pyenv/versions/anaconda3-5.0.1/envs/mujoco_py/lib/python3.8/site-packages/fire/core.py", line 141, in ...
```
Given that COMBO is derived from CQL, why is `CQL_loss` calculated differently in the two implementations?
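For context, a minimal sketch of the conservative penalty both algorithms build on, assuming the standard CQL(H) form (logsumexp of Q over sampled actions minus the dataset Q-value; the function and variable names are illustrative). As I understand the papers, COMBO computes the pushed-down term over model-generated state-action pairs rather than over actions sampled at dataset states, which is one source of the difference:

```python
import torch


def cql_penalty(q_sampled, q_data):
    """CQL(H) penalty sketch: push Q down on sampled actions and up on
    dataset actions. q_sampled has shape (num_sampled_actions, batch),
    q_data has shape (batch,)."""
    return torch.logsumexp(q_sampled, dim=0).mean() - q_data.mean()


q_sampled = torch.randn(10, 256)  # Q on 10 sampled actions per state
q_data = torch.randn(256)         # Q on the dataset actions
penalty = cql_penalty(q_sampled, q_data)
```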
Notes from setting up the environment
# Creating and activating the environment

Create a conda environment. I use Python 3.7 here because it is the last Python version supported by TensorFlow 1.X; later Python versions can only run TensorFlow 2.0 and above. TensorFlow 2.0 was a major overhaul, and a lot of old code no longer works on it.

```bash
conda create -n offline python=3.7
```

Re-initialize conda.

```bash
conda init
```

Activate the environment we just created.

```bash
conda activate offline
```

# TensorFlow and...
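The TensorFlow step above is truncated; as a minimal sketch of what it presumably installs, assuming TensorFlow 1.15 (the last 1.X release) is the intended version:

```bash
# Inside the `offline` environment: TensorFlow 1.15 is the final 1.X
# release and supports Python 3.7.
pip install tensorflow==1.15
```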
```bash
$ python examples/train_task.py --algo_name=cql --exp_name=halfcheetah --task HalfCheetah-v3 --task_data_type low --task_train_num 100
2023-11-03 at 16:35:56.112 | INFO | Use cql algorithm!
running build_ext
2023-11-03 at 16:35:56.618 | INFO | obs...
```
1. In `setup.py`, I suggest changing `'sklearn'` to `'scikit-learn'` in `install_requires`; if only `sklearn` is installed, the `from sklearn.preprocessing import MinMaxScaler` step in `data.py` raises an error.
2. The `test_on_real_env` function in `evaluation/neorl.py` contains the check `if "sp" or "sales" in env._name`, but d4rl environments do not seem to have a `_name` attribute, so this raises an error (see the sketch after this list).
3. `ray==1.2` raises an error:

```
Traceback (most recent call last):
  File "/home/luofm/Utils/pyenv/offlinerl/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 323, in <module>
    loop.run_until_complete(agent.run())
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in ...
```
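On point 2, besides the missing `_name` attribute, the condition itself is a Python precedence pitfall: `"sp" or "sales" in env._name` parses as `("sp") or ("sales" in env._name)`, and the non-empty literal `"sp"` is always truthy, so the branch always runs. A minimal sketch of the intended check; the `getattr` fallback and the helper name are my own suggestions, not the repo's code:

```python
def is_sp_or_sales(env) -> bool:
    # getattr with a default avoids AttributeError on d4rl envs,
    # which lack `_name`.
    name = getattr(env, "_name", "")
    # Buggy original: `if "sp" or "sales" in name` is always True,
    # because the non-empty string "sp" is truthy on its own.
    # Intended: test each substring against the name separately.
    return "sp" in name or "sales" in name


class FakeEnv:
    _name = "sales-v0"


print(is_sp_or_sales(FakeEnv()))  # True
print(is_sp_or_sales(object()))   # False: no `_name`, falls back to ""
```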
Hi there, after getting everything installed, I ran the provided script and hit this error:

```
File "/home/hsy/PycharmProjects/OfflineRL/offlinerl/utils/net/tanhpolicy.py", line 29, in __init__
    self.mode = torch.tanh(normal_mean)
AttributeError: can't set attribute
```

PyCharm also...
When I run MOPO, I find that the cumulative reward drops sharply. Why does this happen? Has this ever happened to you too? Thanks