Steps with multiple commands in them will crash the gym

Open dfoxfranke opened this issue 10 months ago • 2 comments

Inform's parser accepts multiple commands on a single line, e.g. "GO NORTH. GO EAST." or "DROP PEPPER THEN GET KNIFE". Calling env.step with such a string raises an exception in Inform7Data._gather_infos.
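As a stopgap on the caller's side, a compound line can be split into single commands before each one is passed to env.step. The sketch below is only an illustration: split_commands is a hypothetical helper, not part of TextWorld, and it handles just the two separators mentioned above (a period and the word THEN), not everything Inform's parser may accept.

```python
import re
from typing import List

# Hypothetical helper, not part of TextWorld.
# Assumption: only "." and the word THEN (any case) separate commands.
_SEPARATORS = re.compile(r"\s*\.\s*|\s+then\s+", flags=re.IGNORECASE)


def split_commands(line: str) -> List[str]:
    """Split one input line into individual commands, dropping empty pieces."""
    return [part.strip() for part in _SEPARATORS.split(line) if part.strip()]


single_steps = split_commands("DROP PEPPER THEN GET KNIFE")
```

Each element of single_steps could then be sent through its own env.step call, at the cost of counting as a separate step.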

dfoxfranke avatar Feb 17 '25 23:02 dfoxfranke

Interesting. Do you have a minimal working example I can use to debug (i.e., the generated .json and .z8 files)?

Looking back at TextWorld's code, it should be able to parse multiple commands (at least part of the codebase is able to): https://github.com/microsoft/TextWorld/blob/main/textworld/envs/wrappers/tw_inform7.py#L300-L313

MarcCote avatar Feb 18 '25 12:02 MarcCote

The attached game was generated via tw-make tw-cooking --recipe 3 --take 2 --go 12 --open --cook --cut --drop --seed 1: ecf1762719a475c1fe7a8fce3c09bddc519a3f598d9afa7ece93f01ca0f096db-1.zip. TextWorld was installed by executing !{sys.executable} -m pip install textworld from a default Colab environment.

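import textworld
import textworld.gym

def crash_demo(path):
    # Request the win/loss flags alongside the default observation.
    request_infos = textworld.core.EnvInfos(lost=True, won=True)
    env_id = textworld.gym.register_game(
        path, request_infos=request_infos,
    )

    game_env = textworld.gym.make(env_id)
    game_env.reset()

    # Two commands joined by THEN in a single step; this call raises the ValueError.
    game_env.step("OPEN SLIDING PATIO DOOR THEN GO NORTH")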

Calling crash_demo on this game yields:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
[<ipython-input-55-db19323702ae>](https://localhost:8080/#) in <cell line: 0>()
----> 1 crash_demo(make_game(1, TW_MAKE_ARGS))

12 frames
[<ipython-input-50-505e40e69955>](https://localhost:8080/#) in crash_demo(path)
      8     game_env.reset()
      9 
---> 10     game_env.step("OPEN SLIDING PATIO DOOR THEN GO NORTH")

[/usr/local/lib/python3.11/dist-packages/textworld/gym/envs/textworld.py](https://localhost:8080/#) in step(self, command)
     63             * infos: additional information as requested.
     64         """
---> 65         obs, scores, dones, infos = super().step([command])
     66         return obs[0], scores[0], dones[0], {k: v[0] for k, v in infos.items()}

[/usr/local/lib/python3.11/dist-packages/textworld/gym/envs/textworld_batch.py](https://localhost:8080/#) in step(self, commands)
    148 
    149         self.last_commands = commands
--> 150         self.obs, scores, dones, infos = self.batch_env.step(self.last_commands)
    151         return self.obs, scores, dones, infos
    152 

[/usr/local/lib/python3.11/dist-packages/textworld/envs/batch/batch_env.py](https://localhost:8080/#) in step(self, actions)
    269                 results.append((obs, reward, done, infos))
    270             else:
--> 271                 results.append(env.step(action))
    272 
    273         self.last = results

[/usr/local/lib/python3.11/dist-packages/textworld/envs/wrappers/filter.py](https://localhost:8080/#) in step(self, command)
     44 
     45     def step(self, command: str) -> Tuple[str, Mapping[str, Any]]:
---> 46         game_state, score, done = super().step(command)
     47         ob = game_state.feedback
     48         infos = self._get_requested_infos(game_state)

[/usr/local/lib/python3.11/dist-packages/textworld/core.py](https://localhost:8080/#) in step(self, command)
    347 
    348     def step(self, command: str) -> Tuple[GameState, float, bool]:
--> 349         return self._wrapped_env.step(command)
    350 
    351     def reset(self) -> GameState:

[/usr/local/lib/python3.11/dist-packages/textworld/envs/wrappers/limit.py](https://localhost:8080/#) in step(self, command)
     25 
     26     def step(self, command: str) -> Tuple[str, Mapping[str, Any]]:
---> 27         game_state, score, done = super().step(command)
     28         self.nb_steps += 1
     29         done |= self.nb_steps >= self.max_episode_steps

[/usr/local/lib/python3.11/dist-packages/textworld/core.py](https://localhost:8080/#) in step(self, command)
    347 
    348     def step(self, command: str) -> Tuple[GameState, float, bool]:
--> 349         return self._wrapped_env.step(command)
    350 
    351     def reset(self) -> GameState:

[/usr/local/lib/python3.11/dist-packages/textworld/core.py](https://localhost:8080/#) in step(self, command)
    347 
    348     def step(self, command: str) -> Tuple[GameState, float, bool]:
--> 349         return self._wrapped_env.step(command)
    350 
    351     def reset(self) -> GameState:

[/usr/local/lib/python3.11/dist-packages/textworld/core.py](https://localhost:8080/#) in step(self, command)
    347 
    348     def step(self, command: str) -> Tuple[GameState, float, bool]:
--> 349         return self._wrapped_env.step(command)
    350 
    351     def reset(self) -> GameState:

[/usr/local/lib/python3.11/dist-packages/textworld/envs/wrappers/tw_inform7.py](https://localhost:8080/#) in step(self, command)
    293 
    294     def step(self, command: str):
--> 295         self.state, score, done = self._wrapped_env.step(command)
    296         if not self.tracking:
    297             return self.state, score, done  # State tracking not needed.

[/usr/local/lib/python3.11/dist-packages/textworld/envs/wrappers/tw_inform7.py](https://localhost:8080/#) in step(self, command)
    146         extra_infos, self.state["feedback"] = _detect_extra_infos(self.state["feedback"], self._tracked_infos)
    147         self.state.update(extra_infos)
--> 148         self._gather_infos()
    149         self.state["done"] = self.state["won"] or self.state["lost"]
    150         return self.state, self.state["score"], self.state["done"]

[/usr/local/lib/python3.11/dist-packages/textworld/envs/wrappers/tw_inform7.py](https://localhost:8080/#) in _gather_infos(self)
    136         for info in ["score", "moves"]:
    137             if self.state[info] is not None and type(self.state[info]) is not int:
--> 138                 self.state[info] = int(self.state[info].strip())
    139 
    140         self.state["won"] = '*** The End ***' in self.state["feedback"]

ValueError: invalid literal for int() with base 10: '1\n</moves>\n-= Corridor =-\nYou find yourself in a corridor. A typical kind of place. You try to gain information on your surroundings by using a technique you call "looking."\n\n\n\nThere is an ope
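The message suggests that the captured moves value still carries a closing tag and the next room description, presumably because the two commands produced two blocks of output. A minimal sketch of the failure and of a more tolerant parse follows; leading_int is a hypothetical helper, not TextWorld's code or its fix, and the sample string is shortened from the message above.

```python
import re


def leading_int(raw: str) -> int:
    """Return the integer at the start of `raw`, ignoring whatever follows it."""
    match = re.match(r"\s*(-?\d+)", raw)
    if match is None:
        raise ValueError(f"no leading integer in {raw!r}")
    return int(match.group(1))


# Shortened from the error message: the count, a stray closing tag, then room text.
polluted = '1\n</moves>\n-= Corridor =-\nYou find yourself in a corridor.'

try:
    int(polluted.strip())  # same conversion as in _gather_infos
    strict_parse_failed = False
except ValueError:
    strict_parse_failed = True

recovered = leading_int(polluted)
```

Tolerant parsing of this kind would only hide the crash; which turn's move count and feedback should be reported for a compound command is a separate question.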

dfoxfranke avatar Feb 18 '25 14:02 dfoxfranke