rlpyt
Support for weird graph-based observation data type
Hello, I would like to use your framework in my research for its multithreading features, but I have a bit of a weird MDP. The state is a pytorch_geometric graph-structured data object plus a variable-length array. The action space is also a variable-length array. Is there an easy way I can make this work with your framework?
tl;dr: will this library work with arbitrary observation data types?
Hi, interesting question! One challenge here is that memory for the observations and actions is pre-allocated according to the sampler batch size, so it can't hold variable-sized observations or actions directly. But if you can specify a maximum length ahead of time, and deal with having trailing zeros (or some other non-value, maybe even NaN), then that could work.
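To make the padding idea concrete, here is a minimal sketch in plain NumPy. It is not part of rlpyt; `MAX_LEN`, `pad_to_max`, and `unpad` are hypothetical names chosen for illustration, and the boolean mask is one possible way to keep track of which entries are real rather than padding.

```python
import numpy as np

MAX_LEN = 8  # hypothetical upper bound, chosen ahead of time


def pad_to_max(values, max_len=MAX_LEN, fill=0.0):
    """Pad a 1-D variable-length array to a fixed length.

    Returns the padded array and a boolean mask marking the valid entries.
    """
    values = np.asarray(values, dtype=np.float32)
    n = values.shape[0]
    if n > max_len:
        raise ValueError(f"length {n} exceeds max_len {max_len}")
    padded = np.full((max_len,), fill, dtype=np.float32)
    padded[:n] = values
    mask = np.zeros((max_len,), dtype=bool)
    mask[:n] = True
    return padded, mask


def unpad(padded, mask):
    """Recover the original variable-length array from a padded one."""
    return padded[mask]
```

Because every padded array has the same shape, it can be written into a fixed-size buffer; the model or environment would then use the mask (or the fill value) to ignore the trailing entries.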
Hmm yes this is an interesting case, to support graph neural networks, which take in variable-length observations...
Let us know what you try?
@astooke Where is this memory pre-allocated, and where could I modify it? Could you give me some code pointers?
Also, are there better visualization tools than viskit that will work with this data format? Unfortunately, viskit is quite nascent and underwhelming.
> where is this memory pre-allocated and where could I modify it?
Sure! Here it is in the serial sampler:
https://github.com/astooke/rlpyt/blob/75e96cda433626868fd2a30058be67b99bbad810/rlpyt/samplers/serial/sampler.py#L36
Otherwise, look for build_samples_buffer() and see inside that.
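As a rough illustration of what pre-allocation from an example means, here is a sketch in plain NumPy. It is not the code inside build_samples_buffer(); `preallocate_like` is a hypothetical helper, and the leading dimensions are assumed to be time steps and parallel environments.

```python
import numpy as np


def preallocate_like(example, batch_T, batch_B):
    """Allocate a zeroed buffer with leading dims (batch_T, batch_B).

    The trailing dims and dtype are taken from a single example, which is
    why every observation written later must share that fixed shape.
    """
    example = np.asarray(example)
    return np.zeros((batch_T, batch_B) + example.shape, dtype=example.dtype)


# Hypothetical example observation: a padded array of fixed length 8.
example_obs = np.zeros(8, dtype=np.float32)
obs_buffer = preallocate_like(example_obs, batch_T=5, batch_B=2)

# Writing one observation at time step 0 for environment 1.
obs_buffer[0, 1] = np.arange(8, dtype=np.float32)
```

Since the buffer shape is fixed once from the example, changing the maximum length would mean changing the example observation the environment produces.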
> Also are there better visualization tools that will work with this data format than viskit?
Good question, want to move it to its own issue thread so others might see it and comment? I use viskit and find it pretty good for separating hyperparameters, but there is a pending pull request for TensorBoard (minor changes only, should be merged soon). The format is: data from each experiment is recorded in a CSV that sits in its own folder, along with a configuration JSON, which includes a run_ID to distinguish multiple runs launched with the same hyperparameters.
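Given that layout, the logs can also be read with any tool that handles CSV and JSON. Below is a minimal sketch using only the Python standard library. The file names `progress.csv` and `params.json` are assumptions for illustration (adjust them to whatever your run folders contain), and `load_runs` is a hypothetical helper.

```python
import csv
import json
from pathlib import Path


def load_runs(root, csv_name="progress.csv", config_name="params.json"):
    """Collect (config, rows) for every run folder under `root`.

    A run folder is any directory containing both the CSV and the JSON file.
    File names are assumed defaults, not confirmed by this thread.
    """
    runs = []
    for config_path in sorted(Path(root).rglob(config_name)):
        csv_path = config_path.parent / csv_name
        if not csv_path.exists():
            continue  # skip folders with a config but no logged data
        with open(config_path) as f:
            config = json.load(f)
        with open(csv_path, newline="") as f:
            rows = list(csv.DictReader(f))  # values are read as strings
        runs.append({"dir": config_path.parent, "config": config, "rows": rows})
    return runs
```

Runs that share hyperparameters could then be grouped by comparing their configs with the `run_ID` key left out, and plotted with whichever library you prefer.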
@tarungog Hi! Curious if you pursued anything for variable-sized observations and actions?