Super-mario-bros-PPO-pytorch
Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
I downloaded the code and ran it successfully, but there is a problem with the output video (attached): https://user-images.githubusercontent.com/62637488/180977645-a35737bc-ea35-4f68-82be-1fe98b06cf17.mp4
While testing the model I get this: Traceback (most recent call last): File "test.py", line 65, in test(opt) File "test.py", line 55, in test state, reward, done, info = env.step(action)...
While studying your Mario PPO code, https://github.com/uvipen/Super-mario-bros-PPO-pytorch/blob/master/train.py, I found the following code hard to understand: values = torch.cat(values).detach() # torch.Size([4096]) states = torch.cat(states) gae = 0 R = [] for...
base ❯ pip install nes_py-8.1.2-cp37-cp37m-macosx_10_15_x86_64.whl Processing ./nes_py-8.1.2-cp37-cp37m-macosx_10_15_x86_64.whl Requirement already satisfied: numpy>=1.18.5 in /Users/etsiva/anaconda3/lib/python3.7/site-packages (from nes-py==8.1.2) (1.19.2) Requirement already satisfied: tqdm>=4.32.2 in /Users/etsiva/anaconda3/lib/python3.7/site-packages (from nes-py==8.1.2) (4.49.0) Collecting gym>=0.17.2 Using cached gym-0.18.0-py3-none-any.whl...

I tried to run your code on Windows, but encountered OSError and EOFError. I checked env.py, and I don't think the following paragraph is meaningful. Would you...
Hello, I'm curious about the logic behind the reward design here. Could you explain it? https://github.com/uvipen/Super-mario-bros-PPO-pytorch/blob/9cd3fe4283331e9232088a19b03518fe94524a2f/src/env.py#L54 https://github.com/uvipen/Super-mario-bros-PPO-pytorch/blob/9cd3fe4283331e9232088a19b03518fe94524a2f/src/env.py#L61
I get an error: RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 3.95 GiB total capacity; 2.69 GiB already allocated; 10.69 MiB free; 2.78 GiB reserved...
File "E:\anaconda\lib\multiprocessing\connection.py", line 170, in fileno self._check_closed() File "E:\anaconda\lib\multiprocessing\connection.py", line 136, in _check_closed raise OSError("handle is closed") OSError: handle is closed Traceback (most recent call last): File "", line 1,...
Hello, I looked at your code and have some questions. On line 114 of train.py, i.e. values = torch.cat(values).detach(), I think this statement should come after line 123. In line 120,...
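Several of the questions above concern the loop in train.py that begins with gae = 0 and R = []. That pattern usually indicates Generalized Advantage Estimation (GAE), computed backwards over a rollout. The sketch below is a generic, plain-Python illustration of GAE, not the repository's code: the function name compute_gae, its parameters, and the default values of gamma and lam are assumptions made for this example. It treats the value estimates as fixed numbers, which is the role .detach() plays when the values are tensors.

```python
def compute_gae(rewards, values, dones, next_value, gamma=0.9, lam=1.0):
    """Generic GAE sketch (hypothetical helper, not taken from train.py).

    rewards, values, dones: per-step lists of equal length for one rollout.
    next_value: value estimate for the state that follows the last step.
    Returns (advantages, returns), where returns[t] = advantages[t] + values[t].
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    # Walk the rollout backwards so each step can reuse the advantage
    # already accumulated for the step after it.
    for t in reversed(range(len(rewards))):
        # A finished episode contributes no bootstrapped value.
        mask = 0.0 if dones[t] else 1.0
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * next_value * mask - values[t]
        # Exponentially weighted sum of future TD errors.
        gae = delta + gamma * lam * mask * gae
        advantages[t] = gae
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


# Two-step rollout that terminates at the second step.
advantages, returns = compute_gae(
    rewards=[1.0, 1.0],
    values=[0.0, 0.0],
    dones=[False, True],
    next_value=5.0,
    gamma=0.5,
    lam=1.0,
)
```

In this toy rollout the second step ends the episode, so next_value is ignored there; the first step then adds the discounted advantage of the second step to its own TD error. The returns serve as regression targets for the critic, and the advantages weight the policy loss.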