YonV1943 曾伊言

Results 40 comments of YonV1943 曾伊言

Adding `args.if_off_policy=False` when you use on-policy algorithms (PPO, A2C) will fix this bug. Add `args.if_off_policy=True` if you use other DRL algorithms. We added a function to automatically set...
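A minimal sketch of the flag in an `Arguments`-style config object; the class name and constructor here are illustrative, not ElegantRL's exact API:

```python
class Arguments:
    """Hypothetical training-config object; illustrative only."""
    def __init__(self, if_off_policy=True):
        # on-policy algorithms (PPO, A2C) must set this to False
        self.if_off_policy = if_off_policy

# e.g. when training with PPO (an on-policy algorithm):
args = Arguments(if_off_policy=False)
```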

We have recently run some comparisons in MuJoCo and Isaac Gym environments. We will update the web page with these results. ![MuJoCo Ant task in Isaac Gym (over 10...

We will update the function `explore_env` and fix this bug: 1. remove the `max_step` limit on the trajectory 2. update `class ReplayBuffer`
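A sketch of what the fixed exploration loop could look like, assuming a gym-style environment; `DummyEnv` and the function body are illustrative stand-ins, not ElegantRL's actual `explore_env`:

```python
class DummyEnv:
    """Toy stand-in for a gym-style environment (illustrative only)."""
    def __init__(self, episode_len=5):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return float(self.t), 1.0, done, {}

def explore_env(env, policy):
    """Collect one full trajectory, running until `done`
    instead of stopping at a fixed max_step."""
    trajectory = []
    state = env.reset()
    done = False
    while not done:  # no max_step cap: stop only when the episode ends
        action = policy(state)
        next_state, reward, done, _ = env.step(action)
        trajectory.append((state, action, reward, done))
        state = next_state
    return trajectory

traj = explore_env(DummyEnv(episode_len=5), policy=lambda s: 0)
```

The trajectory length is now governed solely by the environment's own termination signal.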

Perhaps the stochasticity is introduced by `env.reset()`, because both stochastic and deterministic policy algorithms use a deterministic policy by default during the testing phase.

It is possible to add metrics to TensorBoard:
```
with test_summary_writer.as_default():
    tf.summary.scalar('loss', test_loss.result(), step=epoch)
```
ElegantRL uses the following methods to print and plot the training logging metrics: https://github.com/AI4Finance-Foundation/ElegantRL/issues/128#issuecomment-1079674617 You...
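As a dependency-free alternative to TensorBoard, here is a minimal stdlib sketch of logging scalar metrics to CSV for later plotting; this mirrors the print-and-plot idea but is not ElegantRL's actual recorder:

```python
import csv
import io

# Log scalar training metrics as CSV rows (here into an in-memory buffer;
# a real run would write to a file and plot the columns afterwards).
log = io.StringIO()
writer = csv.writer(log)
writer.writerow(['epoch', 'loss'])
for epoch, loss in enumerate([0.9, 0.5, 0.3]):  # dummy loss values
    writer.writerow([epoch, loss])

rows = log.getvalue().strip().splitlines()
```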

Yes. We plan to hook a DQN or D3QN algorithm up to the Atari Breakout environment. While we are at it, here is how to train a model on Atari environments using ElegantRL's **multiprocessing feature**. First, test code that can run an Atari game:
```
env = gym.make('<name of some Atari game>')
state = env.reset()
done = False
while not done:
    action = env.action_space.sample()
    state, reward, done, _ = env.step(action)
```
Because multiprocessing needs to obtain...
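A runnable version of the rollout above, with a dummy class standing in for the Atari environment (gym and the actual game name are assumed here, so the env is faked):

```python
class FakeAtariEnv:
    """Stand-in for gym.make('<some Atari game>'); illustrative only."""
    def __init__(self, episode_len=3):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 'state0'

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return f'state{self.t}', 0.0, done, {}

env = FakeAtariEnv()
state = env.reset()
done = False
steps = 0
while not done:          # loop until the episode ends
    action = 0           # a trained agent would pick the action here
    state, reward, done, _ = env.step(action)
    steps += 1
```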

Sorry, I don't have any medical image data, so I have not had a chance to try this method on medical images. I think the question you want to ask is: "This method works on three-channel RGB images; does it also work on single-channel grayscale images?" I tried its performance on grayscale images myself: it falls far short of the segmentation quality on RGB images. I think your question...

There are two issues. First, you are using a single thread, while my method on GitHub uses Python's built-in `multiprocessing`. The code shown above is the single-threaded version from the official docs, not the multi-process version I provided; multiple processes are faster than a single thread. Second, you did not show your network utilization; please check it. Also, could you check [Camera] → [Processing device]...

> I have encountered the same problem. Have you solved it? > "Op has type float32 that does not match expected type of int32." When I convert the codes from...

The training pipeline needs to know whether the agent is an off-policy or on-policy DRL algorithm.
```
self.if_off_policy = agent.if_off_policy
```
Change this line to:
```
name = agent_class.__name__
self.if_off_policy = all((name.find('PPO') ==...
```
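Since the snippet above is truncated, here is a hedged sketch of detecting on-policy agents by class name; the exact set of substrings checked is an assumption, not the repository's definitive list:

```python
def infer_if_off_policy(agent_class_name):
    """Assume the agent is off-policy unless its class name contains
    a known on-policy marker; the marker list is an assumption."""
    on_policy_markers = ('PPO', 'A2C')  # assumed on-policy algorithm names
    return all(agent_class_name.find(m) == -1 for m in on_policy_markers)

off1 = infer_if_off_policy('AgentPPO')  # on-policy, so False
off2 = infer_if_off_policy('AgentSAC')  # off-policy, so True
```

This name-based check lets the pipeline configure itself without every agent class declaring an `if_off_policy` attribute.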