
Gaes fees

[Open] crazypythonista opened this issue 2 years ago · 3 comments

Hello, I was trying to work this out on my end from scratch. I have gotten it to the point of training the model and visualizing it, but it seems to crash in the middle of the training session without saving the model.

Environment: Python 3.8.10, TensorFlow 2.3.1, Windows 11. Not using IDLE; running in script mode from a Windows PowerShell virtual env.

Below is the complete traceback of the error I received.

```
2022-03-07 04:17:43.095316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-03-07 04:17:43.100610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
Traceback (most recent call last):
  File "RL-Bitcoin-trading-bot_7.py", line 501, in <module>
    train_multiprocessing(CustomEnv, agent, train_df, train_df_nomalized, num_worker = 5, training_batch_size=50, visualize=True, EPISODES=5)
  File "D:\Mine\RLCurrent\multiprocessing_env.py", line 95, in train_multiprocessing
    a_loss, c_loss = agent.replay(states[worker_id], actions[worker_id], rewards[worker_id], predictions[worker_id], dones[worker_id], next_states[worker_id])
  File "RL-Bitcoin-trading-bot_7.py", line 121, in replay
    advantages, target = self.get_gaes(rewards, dones, np.squeeze(values), np.squeeze(next_values))
  File "RL-Bitcoin-trading-bot_7.py", line 93, in get_gaes
    deltas = [r + gamma * (1 - d) * nv - v for r, d, nv, v in zip(rewards, dones, next_values, values)]
  File "RL-Bitcoin-trading-bot_7.py", line 93, in <listcomp>
    deltas = [r + gamma * (1 - d) * nv - v for r, d, nv, v in zip(rewards, dones, next_values, values)]
TypeError: unsupported operand type(s) for +: 'NoneType' and 'float'
```
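From what I understand, the `TypeError` means one of the four sequences zipped together on line 93 (`rewards`, `dones`, `next_values`, `values`) contains a `None` where a number was expected. For context, here is a minimal sketch of a standard GAE routine matching that line; the `gamma`/`lamda` defaults and the explicit `None` check are assumptions I've added for illustration, not necessarily the tutorial's exact code:

```python
import numpy as np

def get_gaes(rewards, dones, values, next_values,
             gamma=0.99, lamda=0.95, normalize=True):
    """Standard Generalized Advantage Estimation, matching the line in
    the traceback. The None check is added here only to surface the
    failing input with a clearer message than the TypeError."""
    for name, seq in (("rewards", rewards), ("dones", dones),
                      ("values", values), ("next_values", next_values)):
        if any(x is None for x in np.atleast_1d(seq)):
            raise ValueError(f"{name} contains None -- inspect whatever produced it")

    # TD residuals: delta_t = r_t + gamma * (1 - done_t) * V(s_{t+1}) - V(s_t)
    deltas = [r + gamma * (1 - d) * nv - v
              for r, d, nv, v in zip(rewards, dones, next_values, values)]
    gaes = np.array(deltas, dtype=np.float32)

    # Backward pass: lambda-weighted, discounted sum of future residuals
    for t in reversed(range(len(deltas) - 1)):
        gaes[t] = gaes[t] + (1 - dones[t]) * gamma * lamda * gaes[t + 1]

    target = gaes + values  # regression targets for the critic
    if normalize:
        gaes = (gaes - gaes.mean()) / (gaes.std() + 1e-8)
    return np.vstack(gaes), np.vstack(target)
```

Since `values` and `next_values` are passed through `np.squeeze` first, the `None` most likely enters via the critic prediction or the worker queues.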

Any help is highly appreciated. If needed, I'll post code snippets as well for more clarity. Thanks.

crazypythonista avatar Mar 06 '22 22:03 crazypythonista

This is a duplicate of #18

HoaxParagon avatar Mar 06 '22 23:03 HoaxParagon

Also a duplicate of #9

HoaxParagon avatar Mar 07 '22 14:03 HoaxParagon

Hey, I think the problem originates from the output of critic_predict. I suspect that in the original PPO implementation the author had the critic model also watch the previously predicted value, but he removed that in this tutorial, so the critic model no longer takes the previous value as an input. Maybe you should try removing the np.zeros input in the critic_predict function.
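If that's the cause, the fix is just making the prediction call match how the critic model was actually built. A hypothetical before/after sketch, assuming a Keras model (the two-input form corresponds to a critic that also receives the previously predicted value):

```python
import numpy as np

# If the critic was built with two inputs, e.g.
#   Critic = Model(inputs=[state_input, old_values], outputs=value),
# then predict needs a dummy "previous value" array alongside the state:
def critic_predict_two_inputs(critic, state):
    return critic.predict([state, np.zeros((state.shape[0], 1))])

# If, as in this tutorial, the critic only takes the state, the
# np.zeros placeholder should be dropped:
def critic_predict(critic, state):
    return critic.predict(state)
```

Either way, it's worth confirming where the None actually enters (e.g. by printing next_values inside replay) before changing the model call.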

wanga10000 avatar Mar 14 '22 07:03 wanga10000