atari-representation-learning

Incorrect/ambiguous features in Seaquest

Open damnOblivious opened this issue 4 years ago • 3 comments

I think the features extracted for Seaquest are incorrect or ambiguous. In the attached image:

  1. There are 4 enemies, one of them at a different position than the others, but the enemy_obstacle_x_* feature is the same for all 4 of them, i.e. 96.
  2. There is no diver in the current frame; instead, one of the enemies has shot a missile. The extractor is mislabeling that missile as a diver in 'diver_x_0': 45.

image

{'labels': {'player_y': 13, 'oxygen_meter_value': 64, 'num_lives': 3, 'missile_direction': 0, 'diver_x_1': 0, 'player_direction': 0, 'diver_x_0': 45, 'player_x': 76, 'enemy_obstacle_x_3': 96, 'diver_x_3': 0, 'missile_x': 0, 'score_0': 0, 'diver_x_2': 0, 'enemy_obstacle_x_1': 96, 'enemy_obstacle_x_0': 96, 'score_1': 0, 'enemy_obstacle_x_2': 96, 'divers_collected_count': 0}, 'ale.lives': 4}

Could you please look into this, or let us know if we are misinterpreting anything here?

Thanks, Vaibhav

damnOblivious, Feb 23 '20 15:02

Hi Vaibhav, Very observant of you!

  1. I looked into this, and it seems to be a weird artifact of the game. All four enemy subs are always given the same RAM value, which reliably changes as they move. There must be some other logic in the game's code that determines how each enemy sub is offset relative to this RAM value.

  2. Vaibhav, yes, very observant. You reminded me of another funny artifact of the game: the divers and enemy missiles share the same RAM values. As you may notice, the two are never on screen at the same time. Thus, in your example, 3 of the diver/missile RAM values are unused, so they are 0, but one of the enemies shot a missile, so one of the diver RAM values is being used to encode the position of that missile. missile_x and missile_direction refer to the player's missile, not the enemy missile, so I have updated the code to be more precise: see commit f1a63df4590b577c08fd22f43895264669082f11
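
To make the two artifacts concrete, here is a minimal sketch that post-processes a labels dict like the one in the original report. The helper name `summarize_seaquest_labels` and its output keys are hypothetical and not part of the repository. It only encodes what is described above: the four `enemy_obstacle_x_*` entries carry one shared value, and a nonzero `diver_x_*` entry is either a diver or an enemy missile, which the label alone cannot distinguish.

```python
def summarize_seaquest_labels(labels):
    """Group Seaquest labels according to the RAM artifacts described above.

    Hypothetical helper for illustration; not part of the repository.
    """
    enemy_xs = [labels[f"enemy_obstacle_x_{i}"] for i in range(4)]
    shared_slots = [labels[f"diver_x_{i}"] for i in range(4)]
    return {
        # All four enemy subs report one shared value; the per-sub offsets
        # are applied elsewhere in the game logic and are not in the labels.
        "enemy_obstacle_x_shared": enemy_xs[0] if len(set(enemy_xs)) == 1 else None,
        # Nonzero slots hold a diver OR an enemy missile. A value of 0 is
        # treated as "unused" here, following the explanation above.
        "diver_or_enemy_missile_x": [x for x in shared_slots if x != 0],
        # missile_x describes the player's missile only.
        "player_missile_x": labels["missile_x"],
    }


# Values copied from the labels dict in the original report.
labels = {
    "enemy_obstacle_x_0": 96, "enemy_obstacle_x_1": 96,
    "enemy_obstacle_x_2": 96, "enemy_obstacle_x_3": 96,
    "diver_x_0": 45, "diver_x_1": 0, "diver_x_2": 0, "diver_x_3": 0,
    "missile_x": 0,
}
summary = summarize_seaquest_labels(labels)
print(summary)
```

For the reported frame, the single nonzero shared slot (45) is the enemy missile, which matches what is visible on screen.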

Hope that resolves your issue! Evan

eracah, Feb 24 '20 16:02

Hi Evan

Thanks for your response.

I want to train my agent on Seaquest using the environment features (not the RAM).

It would be great if you could give me some pointers or documentation for deciphering the RAM values to extract these features. Also, please let me know if there is any other way I can get those features (just for Seaquest), for example if someone has handcrafted them.

Any help would be highly appreciated.

Thanks, Vaibhav

damnOblivious, Feb 24 '20 21:02

By environment features, do you mean the subset of RAM that we identified to be meaningful? If yes, then check out https://github.com/mila-iqia/atari-representation-learning/issues/40 which has some details on how to extract info from RAM values.
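
As a generic sketch of the idea (not taken from the linked issue or from the repository), extracting named features from RAM amounts to indexing a 128-byte snapshot with a name-to-index map. The function `extract_features` and the indices in `PLACEHOLDER_INDEX_MAP` are assumptions for illustration only; the indices are deliberately fake and would need to be replaced with the ones actually identified for Seaquest.

```python
ATARI_RAM_SIZE = 128  # the Atari 2600 exposes 128 bytes of RAM


def extract_features(ram, index_map):
    """Pick named features out of a RAM snapshot.

    `index_map` maps a feature name to either one RAM index or an iterable
    of indices; the latter is expanded to name_0, name_1, ...
    """
    if len(ram) != ATARI_RAM_SIZE:
        raise ValueError(f"expected {ATARI_RAM_SIZE} RAM bytes, got {len(ram)}")
    features = {}
    for name, index in index_map.items():
        if isinstance(index, int):
            features[name] = int(ram[index])
        else:
            for slot, i in enumerate(index):
                features[f"{name}_{slot}"] = int(ram[i])
    return features


# Placeholder indices, NOT the real Seaquest RAM locations.
PLACEHOLDER_INDEX_MAP = {"player_x": 10, "enemy_obstacle_x": range(20, 24)}

# Fake RAM whose byte at position i equals i, so the output is easy to check.
ram = bytes(range(ATARI_RAM_SIZE))
features = extract_features(ram, PLACEHOLDER_INDEX_MAP)
print(features)
```

In practice `ram` would come from the emulator at each step, and the resulting dict (or a vector built from it) would be the agent's observation.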

Also, a heads-up: someone else tried training agents on environment features alone (for example, the location of the paddle and ball in Pong), but that doesn't seem to be enough. I am copying my reply here as to why that might be the case:

Check out Table D.1 in Bellemare et al. 2012, which has a column with results from training only on the RAM. There could be multiple reasons why training only on RAM or on location info is not sufficient:

  • The spatial information is more explicit in pixel inputs.
  • It lacks relations between objects, i.e. an interaction module, as you mentioned. See https://arxiv.org/abs/1911.12247, which uses an explicit interaction module on top of object representations. Another paper found that using MLPs on top of features isn't enough and that you might need a GNN: https://openreview.net/forum?id=S1sqHMZCb
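
As a much simpler stand-in for an interaction module (this is not a GNN and not what either linked paper does), one way to expose relations between objects to a downstream model is to compute pairwise offsets between object positions. The function name `pairwise_offsets` and the object layout are assumptions, and the coordinates below are made up for illustration.

```python
from itertools import combinations


def pairwise_offsets(positions):
    """Return {(a, b): (dx, dy)} for every unordered pair of named objects.

    `positions` maps an object name to an (x, y) tuple. Pairs are ordered
    alphabetically by name, and the offset points from a to b.
    """
    return {
        (a, b): (positions[b][0] - positions[a][0],
                 positions[b][1] - positions[a][1])
        for a, b in combinations(sorted(positions), 2)
    }


# Made-up coordinates, purely illustrative.
positions = {"player": (76, 13), "enemy_0": (96, 40)}
offsets = pairwise_offsets(positions)
print(offsets)
```

These hand-crafted offsets only cover pairwise geometry; a learned interaction module can capture richer relations.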

ankeshanand, Feb 24 '20 22:02