batch_rl Reading atari files directly.

Hi, I'm contributing this example of how to read the Atari files directly, in case anyone wants to do that.

Note that the data is stored in the same temporal order in which it was logged, as you can see by watching the replay.

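import gzip
import cv2
import numpy as np

# Dopamine prefixes each replay-buffer array file with this string.
STORE_FILENAME_PREFIX = '$store$_'

# One gzip-compressed .npy file is stored per element.
ELEMS = ['observation', 'action', 'reward', 'terminal']

if __name__ == '__main__':

    data = {}

    # Path to the downloaded replay logs; adjust to your own location.
    data_dir = '/home/duane/data/dqn/Breakout/1/replay_logs/'
    suffix = 0  # checkpoint index to load
    for elem in ELEMS:
        filename = f'{data_dir}{STORE_FILENAME_PREFIX}{elem}_ckpt.{suffix}.gz'
        # gzip.open decompresses on the fly; np.load reads the .npy payload.
        with gzip.open(filename, 'rb') as infile:
            data[elem] = np.load(infile)

    # Replay the frames in logged order, waiting about 20 ms per frame.
    for obs in data['observation']:
        cv2.imshow('obs', obs)
        cv2.waitKey(20)
    cv2.destroyAllWindows()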
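
To try the loading pattern above without downloading the dataset, here is a minimal sketch that writes a tiny synthetic checkpoint to a temporary directory and reads it back. The helper names (`checkpoint_path`, `save_store`, `load_store`, `make_fake_store`) are hypothetical, and the 84x84 `uint8` frames and the other dtypes are assumptions chosen for illustration, not a statement about what the real files contain.

```python
import gzip
import os
import tempfile

import numpy as np

STORE_FILENAME_PREFIX = '$store$_'
ELEMS = ['observation', 'action', 'reward', 'terminal']


def checkpoint_path(data_dir, elem, suffix):
    """Build the file name used in the post above (hypothetical helper)."""
    return os.path.join(data_dir, f'{STORE_FILENAME_PREFIX}{elem}_ckpt.{suffix}.gz')


def save_store(data_dir, suffix, arrays):
    """Write each array as a gzip-compressed .npy file."""
    for elem in ELEMS:
        with gzip.open(checkpoint_path(data_dir, elem, suffix), 'wb') as outfile:
            np.save(outfile, arrays[elem], allow_pickle=False)


def load_store(data_dir, suffix):
    """Read the four arrays back into a dict keyed by element name."""
    data = {}
    for elem in ELEMS:
        with gzip.open(checkpoint_path(data_dir, elem, suffix), 'rb') as infile:
            data[elem] = np.load(infile, allow_pickle=False)
    return data


def make_fake_store(num_steps=10, seed=0):
    """Create synthetic arrays; shapes and dtypes are assumptions."""
    rng = np.random.default_rng(seed)
    return {
        'observation': rng.integers(0, 256, size=(num_steps, 84, 84), dtype=np.uint8),
        'action': rng.integers(0, 4, size=num_steps, dtype=np.int32),
        'reward': rng.random(num_steps).astype(np.float32),
        'terminal': np.zeros(num_steps, dtype=np.uint8),
    }


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp_dir:
        save_store(tmp_dir, 0, make_fake_store())
        for elem, array in load_store(tmp_dir, 0).items():
            print(elem, array.shape, array.dtype)
```

Printing the shape and dtype of each loaded array is also a quick sanity check to run on the real files before visualizing them.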

May 13 '21 18:05 DuaneNielsen

Here is another implementation.

import os.path as osp

from dopamine.discrete_domains import atari_lib
from dopamine.replay_memory import circular_replay_buffer


class DummyWrappedBuffer(circular_replay_buffer.OutOfGraphReplayBuffer):
    # Thin wrapper that only exposes the arguments needed to load the logs.
    def __init__(self,
                 observation_shape,
                 stack_size,
                 replay_capacity=1000000,
                 batch_size=32):
        super().__init__(observation_shape, stack_size, replay_capacity, batch_size)


def load_logs(logs_dir, suffix):
    buffer = DummyWrappedBuffer(
        observation_shape=atari_lib.NATURE_DQN_OBSERVATION_SHAPE,
        stack_size=atari_lib.NATURE_DQN_STACK_SIZE)
    buffer.load(checkpoint_dir=logs_dir, suffix=suffix)
    # A dict with keys: observation, action, reward, terminal
    return buffer._store


if __name__ == '__main__':
    # Placeholder values (assumptions): point these at your own download.
    log_base = '/path/to/dqn'
    game = 'Breakout'
    log_split = '1'
    log_path = osp.join(log_base, game, log_split, "replay_logs")
    store = load_logs(log_path, "0")

May 17 '21 14:05 Altriaex

Thank you for the great work. I wonder whether the dataset is stored sequentially. That is, following Duane's implementation, is data["observation"][1] the state reached after taking data["action"][0] in data["observation"][0]?

Additionally, do you know whether there is a similar dataset implemented with PyTorch rather than Dopamine?

Nov 24 '21 00:11 jsw7460

Yes, the dataset is composed of (s, a, r, s') tuples in the way you described (you can visualize the data in a Colab/Jupyter notebook). For a PyTorch version of the dataset, you can use the code for this NeurIPS'21 paper: https://github.com/mila-iqia/SGI/blob/master/src/offline_dataset.py.
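
As a rough illustration of that indexing, the sketch below walks the time-ordered arrays and yields (s, a, r, s', done) tuples. It is not taken from batch_rl or Dopamine: `iter_transitions` is a hypothetical helper, the data is synthetic, and it assumes that terminal[t] == 1 marks step t as the last step of its episode, so observation[t + 1] is then not a valid successor. It also ignores frame stacking, treating each stored frame as a state.

```python
import numpy as np


def iter_transitions(data):
    """Yield (s, a, r, s_next, done) from time-ordered arrays.

    Assumption: terminal[t] == 1 means the episode ended after action[t],
    so s_next is returned as None for that step.
    """
    obs = data['observation']
    act = data['action']
    rew = data['reward']
    term = data['terminal']
    for t in range(len(obs)):
        done = bool(term[t])
        if done:
            s_next = None
        elif t + 1 < len(obs):
            s_next = obs[t + 1]
        else:
            # The successor of the final non-terminal step is not in this array.
            break
        yield obs[t], act[t], rew[t], s_next, done


# Tiny synthetic log: two episodes of 3 and 2 steps.
data = {
    'observation': np.arange(5, dtype=np.uint8).reshape(5, 1, 1),
    'action': np.array([0, 1, 2, 3, 0], dtype=np.int32),
    'reward': np.array([0.0, 0.0, 1.0, 0.0, 1.0], dtype=np.float32),
    'terminal': np.array([0, 0, 1, 0, 1], dtype=np.uint8),
}
transitions = list(iter_transitions(data))
```

For training an agent that consumes stacked frames, each state would instead be built from several consecutive observations within the same episode.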

Nov 24 '21 01:11 agarwl
