snake-ai
Testing works, but training fails with an error
(SnakeAI) E:\snake-ai-master\main>python train_cnn.py
Using cuda device
Wrapping the env in a VecTransposeImage.
Process SpawnProcess-5:
Traceback (most recent call last):
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 30, in _worker
    observation, reward, done, info = env.step(data)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\monitor.py", line 95, in step
    observation, reward, done, info = self.env.step(action)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\gym\core.py", line 289, in step
    return self.env.step(action)
  File "E:\snake-ai-master\main\snake_game_custom_wrapper_cnn.py", line 47, in step
    self.done, info = self.game.step(action) # info = {"snake_size": int, "snake_head_pos": np.array, "prev_snake_head_pos": np.array, "food_pos": np.array, "food_obtained": bool}
  File "E:\snake-ai-master\main\snake_game.py", line 96, in step
    self.sound_game_over.play()
AttributeError: 'SnakeGame' object has no attribute 'sound_game_over'
Traceback (most recent call last):
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_cnn.py", line 95, in
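+1, I'm hitting the same error and would like a fix.
Add a check for silent_mode before the sound is played; there is no need to play sounds during training.
The exact spot is in snake_game.py, around line 95.
sound_game_over has no effect on training the model. You can comment out self.sound_game_over.play() and put pass in its place, then restore the line when you want to run the test scripts.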
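The silent_mode check suggested above can be sketched as follows. This is a minimal stand-in, not the repository's actual snake_game.py: the `MiniSnakeGame` class, the `_FakeSound` helper and the `game_over()` method are hypothetical, and only the names `silent_mode` and `sound_game_over` come from this thread. It assumes the sound object is only created when silent mode is off, which would explain the AttributeError raised in the training workers.

```python
class _FakeSound:
    """Hypothetical stand-in for a pygame sound object."""

    def __init__(self):
        self.play_count = 0

    def play(self):
        self.play_count += 1


class MiniSnakeGame:
    """Hypothetical, stripped-down game used only to illustrate the guard."""

    def __init__(self, silent_mode=True):
        self.silent_mode = silent_mode
        if not silent_mode:
            # Assumption: sound assets are only loaded when a window is shown.
            self.sound_game_over = _FakeSound()

    def game_over(self):
        # Guard the call so training (silent_mode=True) never touches the sound.
        if not self.silent_mode:
            self.sound_game_over.play()
        return True


if __name__ == "__main__":
    MiniSnakeGame(silent_mode=True).game_over()   # no AttributeError
    MiniSnakeGame(silent_mode=False).game_over()  # plays the (fake) sound
```

Compared with commenting the line out, a guard like this would not need to be undone before running the test scripts.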
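That worked, thanks a lot!
Why am I stuck here?
The MLP training script prints nothing to the command line while it runs. On Windows, press Ctrl+Shift+Esc to check CPU and GPU usage. On Ubuntu, use htop to check CPU usage and watch -n 1 nvidia-smi to check GPU usage. If the program shows high usage, it is training, not stuck.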
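Another way to confirm progress, assuming your script redirects stdout to a log file the way the train_mlp-based script shared later in this thread does (trained_models_mlp/training_log.txt), is to read the last lines of that file. The helper name `tail_lines` is hypothetical, and because file writes are buffered the log may lag behind the actual training.

```python
from collections import deque
from pathlib import Path


def tail_lines(path, n=20):
    """Return the last n lines of a text file (empty list if it does not exist)."""
    log_path = Path(path)
    if not log_path.exists():
        return []
    with log_path.open("r", encoding="utf-8", errors="replace") as f:
        # A deque with maxlen keeps only the newest n lines while streaming the file.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


if __name__ == "__main__":
    # Path taken from the training script in this thread; adjust if yours differs.
    for line in tail_lines("trained_models_mlp/training_log.txt", n=20):
        print(line)
```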
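How can I visualize the training process?
After training, look under logs: a new folder is created there, and the files in it can be opened with TensorBoard for visualization.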
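As a small sketch of that workflow, the snippet below finds the newest run folder under logs and prints the TensorBoard command to start. The helper name `latest_run_dir` is hypothetical; the folder name logs comes from the training script in this thread, and `tensorboard --logdir` is the standard TensorBoard command line.

```python
from pathlib import Path


def latest_run_dir(log_dir="logs"):
    """Return the most recently modified sub-folder of log_dir, or None."""
    root = Path(log_dir)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    run = latest_run_dir("logs")
    if run is None:
        print("No run folders found under logs/ yet.")
    else:
        print(f"Newest run: {run}")
        # Pointing TensorBoard at the whole folder lets you compare runs.
        print("Start TensorBoard with: tensorboard --logdir logs")
```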
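Thanks for the reply. But what should I do if I want to watch every game the AI plays?
You can borrow from the test_mlp code and use the env.render() function to render the game.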
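The render loop being described looks roughly like the sketch below. It is not the repository's test_mlp code: `CountdownEnv` is a hypothetical toy environment standing in for SnakeEnv, `play_episode` is a hypothetical helper, and the old gym API (step returning four values) is assumed because that is what the traceback and scripts in this thread use.

```python
import time


class CountdownEnv:
    """Hypothetical toy environment using the old gym API (4-tuple step)."""

    def __init__(self, length=3):
        self.length = length
        self.remaining = length
        self.frames_drawn = 0

    def reset(self):
        self.remaining = self.length
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining <= 0
        return self.remaining, 1.0, done, {}

    def render(self):
        # A real env would draw a window; here we only count the calls.
        self.frames_drawn += 1


def play_episode(env, choose_action, frame_delay=0.0):
    """Run one episode, rendering every frame; return the total reward."""
    obs = env.reset()
    env.render()
    total_reward, done = 0.0, False
    while not done:
        action = choose_action(obs)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
        env.render()
        time.sleep(frame_delay)
    return total_reward


if __name__ == "__main__":
    env = CountdownEnv(length=3)
    print(play_episode(env, choose_action=lambda obs: 0))
```

With the real project, `env` would presumably be a Snake environment created with silent mode turned off, and `choose_action` would wrap a call to `model.predict(obs)`; both are assumptions about code not shown in full here.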
I've added the rendering code. GPU usage is high when it runs, so it should be training, but I still can't see any game window.
If you can, please share your code.
    import os
    import sys
    import random
    import time

    from stable_baselines3.common.monitor import Monitor
    from stable_baselines3.common.vec_env import SubprocVecEnv
    from stable_baselines3.common.callbacks import CheckpointCallback
    from sb3_contrib import MaskablePPO
    from sb3_contrib.common.wrappers import ActionMasker

    from snake_game_custom_wrapper_mlp import SnakeEnv

    NUM_ENV = 32
    LOG_DIR = "logs"
    os.makedirs(LOG_DIR, exist_ok=True)

    # Linear scheduler
    def linear_schedule(initial_value, final_value=0.0):
        if isinstance(initial_value, str):
            initial_value = float(initial_value)
            final_value = float(final_value)
            assert (initial_value > 0.0)

        def scheduler(progress):
            return final_value + progress * (initial_value - final_value)

        return scheduler

    def make_env(seed=0):
        def _init():
            env = SnakeEnv(seed=seed)
            env = ActionMasker(env, SnakeEnv.get_action_mask)
            env = Monitor(env)
            env.seed(seed)
            return env
        return _init

    def main():
        # Generate a list of random seeds for each environment.
        seed_set = set()
        while len(seed_set) < NUM_ENV:
            seed_set.add(random.randint(0, int(1e9)))  # randint needs integer bounds

        # Create the Snake environment.
        env = SubprocVecEnv([make_env(seed=s) for s in seed_set])

        lr_schedule = linear_schedule(2.5e-4, 2.5e-6)
        clip_range_schedule = linear_schedule(0.15, 0.025)

        # Instantiate a PPO agent
        model = MaskablePPO(
            "MlpPolicy",
            env,
            device="cuda",
            verbose=1,
            n_steps=2048,
            batch_size=512,
            n_epochs=4,
            gamma=0.94,
            learning_rate=lr_schedule,
            clip_range=clip_range_schedule,
            tensorboard_log=LOG_DIR
        )

        # Set the save directory
        save_dir = "trained_models_mlp"
        os.makedirs(save_dir, exist_ok=True)

        checkpoint_interval = 15625 # checkpoint_interval * num_envs = total_steps_per_checkpoint
        checkpoint_callback = CheckpointCallback(save_freq=checkpoint_interval, save_path=save_dir, name_prefix="ppo_snake")

        # Writing the training logs from stdout to a file
        original_stdout = sys.stdout
        log_file_path = os.path.join(save_dir, "training_log.txt")
        with open(log_file_path, 'w') as log_file:
            sys.stdout = log_file

            # This call returns only after all 100,000,000 timesteps are done.
            model.learn(
                total_timesteps=int(100000000),
                callback=[checkpoint_callback]
            )
            env.close()

        # Restore stdout
        sys.stdout = original_stdout

        # Save the final model
        model.save(os.path.join(save_dir, "ppo_snake_final.zip"))

        # Added part: a single (non-vectorized) env for rendered demo episodes
        demo_env = make_env()()
        with open(log_file_path, 'w') as log_file:
            sys.stdout = log_file
            for i in range(100):
                # Train for another 1,000,000 steps, then play one rendered episode.
                model.learn(
                    total_timesteps=int(1000000),
                    callback=[checkpoint_callback]
                )
                obs = demo_env.reset()
                demo_env.render()
                time.sleep(0.5)
                done = False
                while not done:
                    action, _ = model.predict(obs)
                    obs, _, done, _ = demo_env.step(action)
                    demo_env.render()
                    time.sleep(0.5)

    if __name__ == "__main__":
        main()

Yes, I just added the rendering part on top of train_mlp. There are no error messages at all, but the game window never appears.
This formatting is really hard for me to read. Could you send it again with the formatting preserved?
Hello, is this okay? I've only just started with reinforcement learning, so my questions may be rather silly. Sorry to bother you.
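Sorry, buddy. I was reading the code in my email client, which dropped the formatting; on GitHub it looks fine. I took another careful look at the train_mlp code: the whole training process runs inside MaskablePPO, and I could not find any parameter in its API for calling env.render, so I don't think the training itself can be displayed. You can picture what it would look like, though: the Snake game repeated n times, with the snake eating food to earn rewards and dying to receive penalties, over and over.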
Right. I'm new to reinforcement learning, but I'd still like to see it on screen to confirm that it really is training. train_cnn does call env.render, yet for some reason nothing shows up there either. I asked ChatGPT and it didn't know how to fix it.