minerl TimeoutError: timed out

Hi, how can I solve this timeout error? I'm running my Python code with the nohup command.

Here is the log:

2023-07-19 21:15:15.275057: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-19 21:15:15.840035: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/vec_env/patch_gym.py:49: UserWarning: You provided an OpenAI Gym environment. We strongly recommend transitioning to Gymnasium environments. Stable-Baselines3 is automatically wrapping your environments in a compatibility layer, which could potentially cause issues.
  warnings.warn(
:128: RuntimeWarning: 'minerl.utils.process_watcher' found in sys.modules after import of package 'minerl.utils', but prior to execution of 'minerl.utils.process_watcher'; this may result in unpredictable behaviour
CUDA disponible: True
Utilizando la NVIDIA TITAN Xp
Environment created!
Using cuda device
Wrapping the env with a Monitor wrapper
Wrapping the env in a DummyVecEnv.
Wrapping the env in a VecTransposeImage.
Model created!
Starting learning!
Logging to ./ppo_minerl_tensorboard/PPO_1
Traceback (most recent call last):
  File "/home/gti/TFG/PPO.py", line 79, in
    model.learn(total_timesteps=10000000)
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/ppo/ppo.py", line 308, in learn
    return super().learn(
           ^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 259, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 188, in step
    return self.step_wait()
           ^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/vec_env/vec_transpose.py", line 95, in step_wait
    observations, rewards, dones, infos = self.venv.step_wait()
                                          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 70, in step_wait
    obs, self.reset_infos[env_idx] = self.envs[env_idx].reset()
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/stable_baselines3/common/monitor.py", line 83, in reset
    return self.env.reset(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/shimmy/openai_gym_compatibility.py", line 244, in reset
    return self.gym_env.reset(), {}
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/TFG/PPO.py", line 57, in reset
    obs = self.env.reset(**kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/gym/wrappers/time_limit.py", line 27, in reset
    return self.env.reset(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/herobraine/env_specs/basalt_specs.py", line 78, in reset
    return self.env.reset()
           ^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/herobraine/env_specs/basalt_specs.py", line 57, in reset
    return super().reset()
           ^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/gym/core.py", line 251, in reset
    return self.env.reset(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/env/_singleagent.py", line 22, in reset
    multi_obs = super().reset()
                ^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/env/_multiagent.py", line 446, in reset
    self._send_mission(self.instances[0], agent_xmls[0], self._get_token(0, ep_uid))  # Master
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/env/_multiagent.py", line 605, in _send_mission
    reply = comms.recv_message(instance.client_socket)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/env/comms.py", line 63, in recv_message
    lengthbuf = recvall(sock, 4)
                ^^^^^^^^^^^^^^^^
  File "/home/gti/miniconda3/envs/tensor-env/lib/python3.11/site-packages/minerl/env/comms.py", line 73, in recvall
    newbuf = sock.recv(count)
             ^^^^^^^^^^^^^^^^
TimeoutError: timed out
Attempted to send kill command to minecraft process and failed with exception timed out

Jul 19 '23 19:07 Sanfee18

Enable full logging (see the example in minerl.io/docs regarding the logging library). This will give more information about what went wrong with MineRL. Also make sure you have a valid display attached, or run your code with xvfb-run -a python ....
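As a minimal sketch of the logging suggestion above (not an official MineRL recipe): Python's standard `logging` module can be switched to DEBUG before the environment is created, so that every logger in the process, including MineRL's, prints what it emits. The helper names `enable_full_logging` and `has_display` are made up for illustration, and the `DISPLAY` check is an assumption that only applies to Linux/X11 setups.

```python
import logging
import os


def enable_full_logging() -> None:
    """Send every log record (DEBUG and above) to stderr."""
    # force=True (Python 3.8+) replaces handlers that another library
    # may already have installed on the root logger.
    logging.basicConfig(level=logging.DEBUG, force=True)


def has_display() -> bool:
    """Return True if an X11 display is advertised via $DISPLAY."""
    return bool(os.environ.get("DISPLAY"))


enable_full_logging()
if not has_display():
    logging.warning("No DISPLAY set; consider launching with xvfb-run -a")
```

Calling `enable_full_logging()` before `gym.make(...)` is the important part; the display check is only a convenience warning.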

Jul 20 '23 23:07 Miffyli

Hi Miffyli,

I am also facing a similar problem. I am trying to train an agent, and at around the 40th episode, with 2000 steps per episode, I get a socket timeout error. I've attached the error message below. It appears that I am running out of memory. I would greatly appreciate your insights and suggestions on how to overcome this.

Environment Details:

Operating System: Windows 11
Terminal Java Version: java version "1.8.0_333"
Terminal Java Compiler Version: javac 1.8.0_333
Python Version in Conda Environment: Python 3.9.17
Memory: 16 GB
CPU: Intel i7-13th Gen
GPU: NVIDIA GeForce RTX 4050

Steps Taken:

Decreased the complexity of my script
Moved most processes onto my GPU
Restarted my PC

Despite these attempts, the problem still persists, and I'm unsure about how to proceed. Any guidance or additional information you could provide would be immensely helpful.

Thank you for your time and help. Please let me know if any further information is needed.

Timeout.txt

Jul 23 '23 19:07 Patrickjliu

MineRL is known to leak a bit of memory (sometimes this is a problem, sometimes it is not). The best remedy is to restart the environment every now and then. I also wrap all reset and step calls in try-except, and restart the environment if an error is encountered.
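A rough sketch of that remedy, under stated assumptions: `RestartingEnv` is a hypothetical helper (not part of MineRL or Gym), `make_env` is any zero-argument callable that returns a bare environment with the old Gym API (`reset()`, 4-tuple `step()`, `close()`), and the restart interval is arbitrary. On a failed step it ends the episode and rebuilds the environment on the next reset; scheduled restarts also happen only at episode boundaries.

```python
import time


class RestartingEnv:
    """Recreates a crash-prone environment on errors and on a schedule."""

    def __init__(self, make_env, restart_every=50_000, max_retries=5, delay_s=1.0):
        self.make_env = make_env
        self.restart_every = restart_every
        self.max_retries = max_retries
        self.delay_s = delay_s
        self.env = make_env()
        self.steps_since_restart = 0
        self.needs_restart = False
        self.last_obs = None

    def _recreate(self):
        try:
            self.env.close()
        except Exception:
            pass  # a crashed environment may fail to close cleanly
        self.env = self.make_env()
        self.steps_since_restart = 0
        self.needs_restart = False

    def reset(self):
        # Rebuild after a crash or once enough steps have accumulated.
        if self.needs_restart or self.steps_since_restart >= self.restart_every:
            self._recreate()
        for _ in range(self.max_retries):
            try:
                self.last_obs = self.env.reset()
                return self.last_obs
            except Exception:
                time.sleep(self.delay_s)
                self._recreate()
        raise RuntimeError("environment failed to reset after retries")

    def step(self, action):
        try:
            obs, reward, done, info = self.env.step(action)
        except Exception:
            # End the episode; the next reset() builds a fresh environment.
            self.needs_restart = True
            return self.last_obs, 0.0, True, {"env_crashed": True}
        self.steps_since_restart += 1
        self.last_obs = obs
        return obs, reward, done, info
```

Note that `make_env` should return the bare environment, not something that is itself wrapped in `RestartingEnv`, otherwise every restart would add another layer.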

Jul 24 '23 18:07 Miffyli

I've tried what you suggested and I'm still getting the timeout error, always at around 130k steps of a 5M-step run.

Whenever the timeout error occurs, I also get this other error: "only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices". Based on my code, it comes from extracting the POV from the obs dictionary, which suggests that the obs is not being created correctly.

If you want to see the last logs:

[01:59:43] [Render thread/INFO]: Environment: authHost='https://authserver.mojang.com', accountsHost='https://api.mojang.com', sessionHost='https://sessionserver.mojang.com', servicesHost='https://api.minecraftservices.com', name='PROD'
DEBUG:minerl.env.malmo.instance.84c473:[01:59:43] [Render thread/INFO]: Starting integrated minecraft server version 1.16.5
DEBUG:minerl.env.malmo.instance.84c473:[01:59:43] [Render thread/INFO]: Generating keypair
DEBUG:minerl.env.malmo.instance.84c473:[01:59:49] [Render thread/INFO]: Preparing start region for dimension minecraft:overworld
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: Changing view distance to 11, from 10
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: MineRLAgent0[local:E:8df182a0] logged in with entity id 159 at (-509.5, 67.0, -773.5)
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: MineRLAgent0 joined the game
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: Preparing spawn area: 0%
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: Time elapsed: 5 ms
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: [STDOUT]: Starting new video null
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: Saving and pausing game...
DEBUG:minerl.env.malmo.instance.84c473:[02:00:02] [Render thread/INFO]: Saving chunks for level 'ServerLevel[mcpworlde9ccc189cdfd]'/minecraft:overworld
DEBUG:minerl.env.malmo.instance.84c473:[02:00:03] [Render thread/INFO]: Saving chunks for level 'ServerLevel[mcpworlde9ccc189cdfd]'/minecraft:the_nether
DEBUG:minerl.env.malmo.instance.84c473:[02:00:03] [Render thread/INFO]: Saving chunks for level 'ServerLevel[mcpworlde9ccc189cdfd]'/minecraft:the_end
DEBUG:minerl.env.malmo.instance.84c473:[02:00:03] [Render thread/INFO]: Loaded 0 advancements
DEBUG:minerl.env._multiagent:Peeking the clients.
DEBUG:minerl.env._multiagent:Closing MineRL env...
Encountered exception during reset: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices. Recreating environment.
DEBUG:minerl.env.malmo.instance.84c473:[02:00:06] [EnvServerSocketHandler/INFO]: [STDOUT]: *** Stopping the replay, returning control to the inputs
INFO:process_watcher:About to reap process tree of 352313:launchClient.sh: i zombie, owner 337663, printing process tree status in termination order:
INFO:process_watcher: -352313:launchClient.sh: i zombie, owner 337663
INFO:process_watcher:Trying to SIGTERM 352313:launchClient.sh: i zombie, owner 337663
INFO:process_watcher:Process psutil.Popen(pid=352313, name='launchClient.sh', status='terminated', exitcode=0, started='01:59:14') terminated with exit code 0
Traceback (most recent call last):
  File "/home/dsaneng/simple.py", line 59, in step
    self.reset()
  File "/home/dsaneng/simple.py", line 79, in reset
    raise RuntimeError("Too many exceptions during reset, creating environment, giving up")
RuntimeError: Too many exceptions during reset, creating environment, giving up
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/dsaneng/simple.py", line 158, in
    model.learn(total_timesteps=config["total_timesteps"], callback=WandbCallback())
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py", line 308, in learn
    return super().learn(
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 259, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 197, in step
    return self.step_wait()
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/common/vec_env/vec_transpose.py", line 95, in step_wait
    observations, rewards, dones, infos = self.venv.step_wait()
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 58, in step_wait
    obs, self.buf_rews[env_idx], terminated, truncated, self.buf_infos[env_idx] = self.envs[env_idx].step(
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/stable_baselines3/common/monitor.py", line 94, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/shimmy/openai_gym_compatibility.py", line 255, in step
    obs, reward, done, info = self.gym_env.step(action)
  File "/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/gym/wrappers/monitor.py", line 46, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/dsaneng/simple.py", line 63, in step
    self.reset()
  File "/home/dsaneng/simple.py", line 79, in reset
    raise RuntimeError("Too many exceptions during reset, creating environment, giving up")
RuntimeError: Too many exceptions during reset, creating environment, giving up
ERROR:minerl.env.malmo:Attempted to send kill command to minecraft process and failed with exception timed out
INFO:process_watcher:About to reap process tree of 337830:launchClient.sh:/usr/bin/bash i sleeping, owner 337663, printing process tree status in termination order:
INFO:process_watcher: -337834:java:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java i sleeping, owner 337830
INFO:process_watcher: -337830:launchClient.sh:/usr/bin/bash i sleeping, owner 337663
INFO:process_watcher:Trying to SIGTERM 337834:java:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java i sleeping, owner 337830
INFO:process_watcher:Process 337834 survived SIGTERM; trying SIGKILL on 337834:java:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java i sleeping, owner 337830
DEBUG:minerl.env.malmo.instance.da4bd2:/home/dsaneng/.conda/envs/rl/lib/python3.10/site-packages/minerl/env/../MCP-Reborn/launchClient.sh: line 52: 337834 Killed java -Xmx$maxMem -jar $fatjar --envPort=$port
INFO:process_watcher:Process psutil.Process(pid=337834, name='java', status='terminated', started='23:19:45') terminated with exit code None
INFO:process_watcher:Trying to SIGTERM 337830:launchClient.sh:/usr/bin/bash i zombie, owner 337663
INFO:process_watcher:Process psutil.Popen(pid=337830, name='launchClient.sh', status='terminated', exitcode=0, started='23:19:45') terminated with exit code 0

Aug 05 '23 09:08 Sanfee18

By the way, I'm executing my program on a GPU too, as @Patrickjliu said. Maybe it has to do with that?

Aug 05 '23 09:08 Sanfee18

The error is a bit confusing indeed, but it is still tied to the environment crashing: the Python code expects proper replies but gets empty buffers, and thus crashes like this.

I still think wrapping step and reset in try-except should work: if an exception is raised, delete the environment and recreate it. What does your code logic look like? Note that resetting the environment right after a crash might also fail for a moment (e.g., some process is still hanging around), so you might need to keep retrying the reset inside try-except until it works.

Running MineRL on a GPU (alongside your GPU code) should not really affect things much unless you run out of VRAM completely (you might want to check that). If there were some hard conflict, you would not be able to run the code in the first place.

Aug 05 '23 12:08 Miffyli

Could you provide me some example code to see what you actually mean by deleting and creating a new environment?

I think I've done that properly and still got the error.

Right now I've decided to reset the environment every 50k timesteps, once the episode has ended, to see if that helps, since it always fails at around 130k.

Aug 05 '23 12:08 Sanfee18

The usual

env.close()
env = gym.make("MineRLBasaltFindCave-v0")

Should be enough.

I would still add try-except checks around step and reset in addition to your regular resets, just in case. If you consistently get a crash at a specific step count, you might also want to check whether the machine's memory use (or VRAM use) grows as training progresses.
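One way to watch for that kind of growth, sketched with the standard library only and assuming a Linux machine: read `MemAvailable` from `/proc/meminfo` at regular intervals and log it next to the step counter. The function names are hypothetical, and this covers system RAM only, not VRAM.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: kilobytes}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_memory_mb(path="/proc/meminfo"):
    """Return MemAvailable in MiB, or None if it cannot be read (e.g. non-Linux)."""
    try:
        with open(path) as handle:
            fields = parse_meminfo(handle.read())
    except OSError:
        return None
    kb = fields.get("MemAvailable")
    return None if kb is None else kb / 1024
```

Logging `available_memory_mb()` every few thousand steps would show whether free memory shrinks steadily before the crash point.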

Aug 05 '23 13:08 Miffyli

I've tried what you suggested and I'm still getting the same error:

  File "/usr/local/lib/python3.8/dist-packages/stable_baselines3/common/on_policy_algorithm.py", line 259, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "/usr/local/lib/python3.8/dist-packages/stable_baselines3/common/on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
  File "/usr/local/lib/python3.8/dist-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 197, in step
    return self.step_wait()
  File "/usr/local/lib/python3.8/dist-packages/stable_baselines3/common/vec_env/vec_transpose.py", line 95, in step_wait
    observations, rewards, dones, infos = self.venv.step_wait()
  File "/usr/local/lib/python3.8/dist-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 70, in step_wait
    obs, self.reset_infos[env_idx] = self.envs[env_idx].reset()
  File "/usr/local/lib/python3.8/dist-packages/stable_baselines3/common/monitor.py", line 83, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/shimmy/openai_gym_compatibility.py", line 244, in reset
    return self.gym_env.reset(), {}
  File "/usr/local/lib/python3.8/dist-packages/gym/wrappers/monitor.py", line 53, in reset
    observation = self.env.reset(**kwargs)
  File "PPO.py", line 70, in reset
    obs = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/wrappers/time_limit.py", line 27, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/minerl/herobraine/env_specs/basalt_specs.py", line 78, in reset
    return self.env.reset()
  File "/usr/local/lib/python3.8/dist-packages/minerl/herobraine/env_specs/basalt_specs.py", line 57, in reset
    return super().reset()
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 251, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/minerl/env/_singleagent.py", line 22, in reset
    multi_obs = super().reset()
  File "/usr/local/lib/python3.8/dist-packages/minerl/env/_multiagent.py", line 446, in reset
    self._send_mission(self.instances[0], agent_xmls[0], self._get_token(0, ep_uid))  # Master
  File "/usr/local/lib/python3.8/dist-packages/minerl/env/_multiagent.py", line 605, in _send_mission
    reply = comms.recv_message(instance.client_socket)
  File "/usr/local/lib/python3.8/dist-packages/minerl/env/comms.py", line 63, in recv_message
    lengthbuf = recvall(sock, 4)
  File "/usr/local/lib/python3.8/dist-packages/minerl/env/comms.py", line 73, in recvall
    newbuf = sock.recv(count)
socket.timeout: timed out
ERROR:minerl.env.malmo:Attempted to send kill command to minecraft process and failed with exception timed out
INFO:process_watcher:About to reap process tree of 115:launchClient.sh:/usr/bin/bash i sleeping, owner 44, printing process tree status in termination order:
INFO:process_watcher: -118:java:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java i sleeping, owner 115
INFO:process_watcher: -115:launchClient.sh:/usr/bin/bash i sleeping, owner 44
INFO:process_watcher:Trying to SIGTERM 118:java:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java i sleeping, owner 115
INFO:process_watcher:Process 118 survived SIGTERM; trying SIGKILL on 118:java:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java i sleeping, owner 115
DEBUG:minerl.env.malmo.instance.59a811:/usr/local/lib/python3.8/dist-packages/minerl/env/../MCP-Reborn/launchClient.sh: line 52: 118 Killed java -Xmx$maxMem -jar $fatjar --envPort=$port
INFO:process_watcher:Process psutil.Process(pid=118, name='java', status='terminated', started='16:40:35') terminated with exit code None
INFO:process_watcher:Trying to SIGTERM 115:launchClient.sh:/usr/bin/bash i zombie, owner 44
INFO:process_watcher:Process psutil.Popen(pid=115, name='launchClient.sh', status='terminated', exitcode=0, started='16:40:35') terminated with exit code 0.

Here you can see my code:

import gym
from time import sleep
from gym.spaces import Discrete
from gym.wrappers import Monitor

from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import BaseCallback

import minerl
import wandb

import logging
logging.basicConfig(level=logging.DEBUG)

config = {
    "policy_type": "CnnPolicy",
    "total_timesteps": 5000000,
    "env_name": "MineRLBasaltFindCave-v0",
    "run_name": "PPO_MineRLBasaltFindCave-v0"
}

run = wandb.init(
    project="MineRL",
    entity="minerl_tfg",
    name=config["run_name"],
    config=config,
    sync_tensorboard=True, 
    monitor_gym=True,
    save_code=True,  
)

def make_env():
    env = gym.make(config["env_name"])
    env = BitMaskWrapper(env)  # Apply BitMaskWrapper first
    env = Monitor(env, directory="monitor_results", force=True)  # Then apply Monitor
    # Spanish for "New environment created!!!"
    print("Nuevo entorno creado!!!")
    return env

class BitMaskWrapper(gym.Wrapper):
    def __init__(self, env):
        super(BitMaskWrapper, self).__init__(env)
        self.orig_action_space = self.action_space
        self.action_space = gym.spaces.Discrete(32)  # Modify the action space to Discrete(32)
        self.observation_space = self.observation_space['pov']
        self.noop_action = self.orig_action_space.noop()  # Pre-calculate no-op action

    
    def step(self, action):
        while True:  # Keep trying to step until successful
            try:
                assert 0 <= action < 32, "Invalid action"  # must match Discrete(32)
                masked_action = self._apply_bit_mask(action)
                obs, reward, done, info = self.env.step(masked_action)
                if info:  # Print info only if it's not empty
                    print("Info dictionary:", info)
                obs = obs["pov"]
                obs = obs / 255.0
                return obs, reward, done, info
            except TimeoutError:
                # Spanish for "A TimeoutError occurred. Trying to recreate the environment..."
                print("Ha ocurrido un TimeoutError. Intentando volver a crear el entorno...\n")
                self.env.close()
                self.env = make_env()  # Recreate the environment
                sleep(1)  # Adding a delay to ensure proper cleanup

    def reset(self, **kwargs):
        while True:  # Keep trying to reset until successful
            try:
                obs = self.env.reset(**kwargs)
                obs = obs["pov"]
                obs = obs / 255.0
                return obs
            except TimeoutError:
                # Spanish for "A TimeoutError occurred during reset. Trying to recreate the environment..."
                print("Ha ocurrido un TimeoutError durante el reset. Intentando volver a crear el entorno...")
                self.env.close()
                self.env = make_env()  # Recreate the environment
                sleep(1)  # Adding a delay to ensure proper cleanup


    def _apply_bit_mask(self, action):
        """Applies the bit mask to the action."""

        back_m = action & 1
        forward_m = (action >> 1) & 1
        left_m = (action >> 2) & 1
        right_m = (action >> 3) & 1
        sprint_m = (action >> 4) & 1

        action = self.noop_action.copy()

        action['sprint'] = sprint_m
        action['right'] = right_m
        action['left'] = left_m
        action['forward'] = forward_m
        action['back'] = back_m

        return action

    def get_action_meanings(self):
        # Override this method to reflect the modified action space
        return [str(i) for i in range(self.action_space.n)]

    def render(self, mode='human', **kwargs):
        # Override the render method if necessary
        return self.env.render(mode, **kwargs)

    def seed(self, seed=None):
        # Forward the seed call to the wrapped environment
        return self.env.seed(seed)


class WandbCallback(BaseCallback):
    def __init__(self, verbose=0):
        super(WandbCallback, self).__init__(verbose)
        self.last_logged_episode = -1

    def _on_rollout_end(self):
        env = self.training_env.envs[0].env

        rewards = env.get_episode_rewards()
        lengths = env.get_episode_lengths()

        if len(rewards) > self.last_logged_episode + 1:
            mean_reward = sum(rewards[self.last_logged_episode+1:]) / len(rewards[self.last_logged_episode+1:])
            mean_length = sum(lengths[self.last_logged_episode+1:]) / len(lengths[self.last_logged_episode+1:])

            total_timesteps = env.get_total_steps()  # Retrieve total steps from the Monitor wrapper
            
            wandb.log({
                'mean_reward': mean_reward,
                'mean_episode_length': mean_length,
                'total_timesteps': total_timesteps
            })

            self.last_logged_episode = len(rewards) - 1

    def _on_step(self):
        return True

# Create the BitMaskWrapper around the MineRL environment
env = make_env()
print("Environment created!\n")

# Create your model (e.g., PPO)
model = PPO(config["policy_type"], env, verbose=0, device="cuda")
print("PPO model created!\n")

# Train your model with the callback
model.learn(total_timesteps=config["total_timesteps"], callback=WandbCallback())
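One detail that may matter for the handlers in this code: the traceback in this comment comes from Python 3.8 and ends in `socket.timeout`, while the wrapper catches `TimeoutError`. `socket.timeout` only became an alias of `TimeoutError` in Python 3.10; on older interpreters it is a separate `OSError` subclass, so `except TimeoutError:` would not catch it there. A version-independent sketch follows (the helper `call_with_timeout_guard` is hypothetical, not something from the thread):

```python
import socket

# Tuple that matches a socket timeout on every Python 3 version.
TIMEOUT_ERRORS = (TimeoutError, socket.timeout)


def call_with_timeout_guard(func, on_timeout):
    """Run func(); if it times out, run on_timeout() and return its result."""
    try:
        return func()
    except TIMEOUT_ERRORS:
        return on_timeout()


def always_times_out():
    # Simulates what sock.recv() raises when the Minecraft side stops answering.
    raise socket.timeout("timed out")


result = call_with_timeout_guard(always_times_out, lambda: "recovered")
print(result)  # prints "recovered"
```

Catching the tuple costs nothing on newer interpreters, where both names refer to the same class.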

Aug 06 '23 23:08 Sanfee18

I haven't been able to solve this.

I've changed the code so that it catches every exception and tries to recreate the environment. Now, after the same 120-130k timesteps, I get this error over and over, even when resetting the environment as in the earlier comment.

Error:

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 3.11e+03   |
|    ep_rew_mean          | 0          |
| time/                   |            |
|    fps                  | 17         |
|    iterations           | 62         |
|    time_elapsed         | 7087       |
|    total_timesteps      | 126976     |
| train/                  |            |
|    approx_kl            | 0.24600112 |
|    clip_fraction        | 0.726      |
|    clip_range           | 0.2        |
|    entropy_loss         | -3.34      |
|    explained_variance   | -1.13      |
|    learning_rate        | 0.0003     |
|    loss                 | -0.152     |
|    n_updates            | 610        |
|    policy_gradient_loss | -0.117     |
|    value_loss           | 0.00017    |
----------------------------------------
Info dictionary: {'TimeLimit.truncated': False}
Ha ocurrido el siguiente error durante el reset: timed out
Intentando volver a crear el entorno...
Attempted to send kill command to minecraft process and failed with exception timed out
Nuevo entorno creado!!!
Ha ocurrido el siguiente error durante el reset: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Intentando volver a crear el entorno...
Nuevo entorno creado!!!
Ha ocurrido el siguiente error durante el reset: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Intentando volver a crear el entorno...
Nuevo entorno creado!!!
Ha ocurrido el siguiente error durante el reset: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Intentando volver a crear el entorno...
Nuevo entorno creado!!!
Ha ocurrido el siguiente error durante el reset: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Intentando volver a crear el entorno...
Nuevo entorno creado!!!
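A possible explanation for the repeating index error in this log, offered as a reading of the posted code rather than a confirmed diagnosis: `make_env()` returns an environment that is already wrapped in `BitMaskWrapper` (and `Monitor`), and the recovery path assigns that result to `self.env` inside `BitMaskWrapper` itself. After one recovery the wrappers would be nested, the inner `reset()` would already return the bare `pov` array, and the outer `obs["pov"]` would index an array with a string, which NumPy rejects with exactly this "only integers, slices ..." message. The stand-alone sketch below reproduces the pattern with stub classes. All names are hypothetical, and plain lists stand in for NumPy arrays, so the failure surfaces as a `TypeError` rather than an `IndexError`.

```python
class StubEnv:
    """Stands in for the raw MineRL env: observations are dicts."""

    def reset(self):
        return {"pov": [[0, 0], [0, 0]]}


class PovWrapper:
    """Stands in for BitMaskWrapper: unwraps obs['pov'] on reset."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()["pov"]


def make_wrapped():
    # Like make_env() in the thread: returns an already-wrapped env.
    return PovWrapper(StubEnv())


def make_raw():
    # Returns only the inner env, suitable for assignment to wrapper.env.
    return StubEnv()


wrapper = make_wrapped()
first = wrapper.reset()          # works: dict -> frame

wrapper.env = make_wrapped()     # recovery that nests the wrappers
try:
    wrapper.reset()
    nested_ok = True
except (TypeError, IndexError):
    nested_ok = False            # frame["pov"] is not a valid index

wrapper.env = make_raw()         # recovery that keeps a single layer
fixed = wrapper.reset()
```

If this reading is right, recreating only the inner environment inside the wrapper (for example with `gym.make(config["env_name"])`) would avoid the second error, although it would not address the original timeout.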

Aug 07 '23 12:08 Sanfee18

I would add an additional try-except inside make_env, maybe with a loop that retries a few times (with a 60 s delay), and if the environment still fails to start properly, give up and crash. Apart from that, it is hard to say what is happening. It sounds like something goes wrong after training for long enough (e.g., running out of RAM), given that the environment crashes at a very specific point. I would investigate that further.
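A minimal sketch of that retry loop, with assumed defaults (3 attempts, 60 s between them) and a hypothetical helper name, `create_with_retries`. The `factory` argument is any zero-argument callable that builds the environment; it could also call `reset()` once as a health check before returning.

```python
import time


def create_with_retries(factory, attempts=3, delay_s=60.0, sleep=time.sleep):
    """Call factory() until it succeeds, at most `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory()
        except Exception as error:
            last_error = error
            print(f"Environment start failed ({attempt}/{attempts}): {error}")
            if attempt < attempts:
                sleep(delay_s)  # give leftover processes time to exit
    # All attempts failed: give up loudly instead of looping forever.
    raise RuntimeError("environment failed to start, giving up") from last_error
```

In the thread's setup this might be called as `create_with_retries(lambda: gym.make("MineRLBasaltFindCave-v0"))`; the `sleep` parameter exists only so the delay can be replaced in tests.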

Aug 08 '23 20:08 Miffyli

Hi I'm back.

First of all, since you pointed out that it could be a memory issue, I've moved the runs to an HPC cluster to rule that out. For that I had to create a Singularity container based on the eg-docker repository linked in the MineRL documentation. Maybe I should create a repository with the files needed to reproduce the container, so that anyone who wants to use this kind of environment can copy it and have everything working.

On the other hand, once I got the Singularity container working, I ran my Python code and still got the same timeout error at the same number of timesteps (around 120k-130k), with the try-except you suggested. Since I'm using Wandb to monitor the process, I have access to the system metrics: CPU usage was fine, disk usage reached 82%, only 7 GB of the 48 GB of GPU VRAM were used, and the program only used 5 GB of RAM. So I assume the problem has nothing to do with the machine's resources.

It's also strange that both the computer I was using before and the HPC cluster hit the same problem at the same number of timesteps. Maybe it has something to do with MineRL itself?

Aug 16 '23 10:08 Sanfee18

Heya.

Re docker: yup, sharing results and environments is always good and helps anyone having trouble with things! I would also happily accept a PR that adds a link to your repo to the MineRL docs, if you have the time to create one :)

Re timeouts: huh, that does sound like something could be off with MineRL. Again, it does experience crashes regardless of the underlying system, but I did not expect them to be so regular across machines. One could spend time debugging this, but realistically, this might be intrinsic to Minecraft as well, as we are using it in very unintended ways (re-creating worlds many times). Finding the core issue could be a big effort (or not, hard to say without knowing where to even look 😅 )

A quick remedy is to wrap things in try-except and/or recreate the environment at regular intervals.

Aug 17 '23 19:08 Miffyli

Hi,

Re PR: I've done the pull request with the additions to the documentation, check it out!

Re executions: Okay, I'm going to try recreating the environment at regular intervals, and if that doesn't work I'll try different things to see if I can find a solution. Thanks for your support, I will comment on this issue if I find one :)

Aug 19 '23 19:08 Sanfee18

Hi,

I have good news!

I've finally solved the timeout error by downgrading the MineRL library to version 0.4.4. More than a month for such an easy solution :_)

I've also found that training, with the same Python code, the same environment and the same computer, is now about 35% faster. With version 1.0.0, the 120k timesteps at which I got the error took roughly 2 hours. Now, with version 0.4.4, they took 1 hour and 20 minutes.
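As a quick check of those figures, using only the two durations quoted above (both described as rough): going from 2 hours to 1 hour 20 minutes is about a third less wall-clock time, close to the 35% mentioned, which corresponds to 1.5 times the throughput.

```python
old_minutes = 2 * 60        # "roughly 2 hours" with MineRL 1.0.0
new_minutes = 1 * 60 + 20   # "1 hour and 20 minutes" with MineRL 0.4.4

time_saved = 1 - new_minutes / old_minutes   # fraction of wall-clock time saved
speedup = old_minutes / new_minutes          # throughput ratio

print(f"time saved: {time_saved:.0%}, speedup: {speedup:.2f}x")
# prints "time saved: 33%, speedup: 1.50x"
```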

I hope this thread helped someone and at least it brought some good things such as the Singularity container.

Now I'm able to finish my Bachelor's Thesis and hopefully graduate :)

Feel free to answer and close the issue and thank you so much for your support, it's awesome that you are helping us every day <3

Aug 26 '23 12:08 Sanfee18

Nice work @Sanfee18 ! Do note that 0.4.4 is quite a bit different, but if it suffices for your work, then I'd recommend it indeed :). 0.3.7 might be even faster, if that still applies to your case.

Aug 28 '23 17:08 Miffyli

Hi @Sanfee18, have you ever been able to train the PPO agent? I tried to use your code with MineRL v1.0.1, but the reward remained zero for 110k steps.

Or, have you met the reward issue here?

Jan 05 '24 01:01 huangdi95

Hi @huangdi95 , getting rewards with PPO was almost impossible. I got some, but I feel like it was because of the action mapping, which occasionally left the camera pointing towards a tree, and it eventually broke some wood. That’s all I can tell you.

I wouldn’t bother trying to get rewards out of RL alone, I couldn’t get any result.

Jan 08 '24 12:01 Sanfee18

Thanks! I see.

Jan 09 '24 02:01 huangdi95
