rlpyt Support for weird graph-based observation data type

Hello, I would like to use your framework in my research for its multithreading features, but I have a bit of a weird MDP. The state is a pytorch_geometric graph-structured data object plus a variable-length array. The action space is also a variable-length array. Is there an easy way I can make this work with your framework?

Dec 20 '19 23:12 tarungog

tl;dr will this library work with arbitrary observation data types?

Dec 21 '19 03:12 tarungog

Hi, interesting question! One challenge is that memory for the observations and actions is pre-allocated according to the sampler batch size, so it can't directly hold variable-sized observations or actions. But if you can specify a maximum length ahead of time and deal with trailing zeros (or whatever non-value, maybe even NaN), then that could work.
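To make the padding idea concrete, here is a minimal sketch in plain NumPy. It is not rlpyt code: `MAX_LEN`, `pad_to_max`, and the `(T, B, MAX_LEN)` buffer layout are all hypothetical names and shapes chosen for illustration. It pads each variable-length array up to a fixed maximum and stores a boolean mask alongside it so the valid entries can be recovered later.

```python
import numpy as np

MAX_LEN = 8  # hypothetical upper bound on the array length, fixed ahead of time


def pad_to_max(values, max_len=MAX_LEN, fill=0.0):
    """Pad a variable-length 1-D array to max_len; also return a validity mask."""
    values = np.asarray(values, dtype=np.float32)
    n = values.shape[0]
    if n > max_len:
        raise ValueError(f"length {n} exceeds max_len {max_len}")
    padded = np.full((max_len,), fill, dtype=np.float32)
    padded[:n] = values
    mask = np.zeros((max_len,), dtype=bool)
    mask[:n] = True  # True marks real entries, False marks padding
    return padded, mask


# Illustrative fixed-size buffers: T time steps, B environments (assumed layout).
T, B = 4, 2
obs_buffer = np.zeros((T, B, MAX_LEN), dtype=np.float32)
mask_buffer = np.zeros((T, B, MAX_LEN), dtype=bool)

# Write one variable-length observation into its fixed-size slot.
obs_buffer[0, 0], mask_buffer[0, 0] = pad_to_max([1.0, 2.0, 3.0])

# Recover the original variable-length array using the mask.
recovered = obs_buffer[0, 0][mask_buffer[0, 0]]
```

Storing an explicit mask rather than relying on a sentinel value avoids ambiguity when zero (or NaN) is itself a legitimate observation value.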

Hmm, yes, supporting graph neural networks, which take in variable-length observations, is an interesting case...

Let us know what you try?

Dec 21 '19 17:12 astooke

@astooke Where is this memory pre-allocated, and where could I modify it? Could you give me some code pointers?

Dec 21 '19 23:12 tarungog

Also, are there better visualization tools than viskit that will work with this data format? Unfortunately, viskit is quite nascent and underwhelming.

Dec 21 '19 23:12 tarungog

> where is this memory pre-allocated and where could I modify it?

Sure! Here it is in the serial sampler: https://github.com/astooke/rlpyt/blob/75e96cda433626868fd2a30058be67b99bbad810/rlpyt/samplers/serial/sampler.py#L36
Otherwise look for build_samples_buffer() and see inside that.

> Also are there better visualization tools that will work with this data format than viskit?

Good question! Want to move it to its own issue thread so others might see it and comment? I use viskit and find it pretty good for separating hyperparameters, but there is a pending pull request for TensorBoard (minor changes only, should be merged soon). The format is: data from each experiment is recorded in a CSV which sits in its own folder, along with a configuration JSON, which includes a run_ID for multiple runs launched with the same hyperparameters.
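Since the format is just one CSV plus one JSON per run folder, it can be read into any plotting tool with the standard library. The sketch below assumes that layout; the function name `load_runs` and the default file names `progress.csv` and `params.json` are assumptions for illustration, so pass whatever names your runs actually produce.

```python
import csv
import json
from pathlib import Path


def load_runs(root, csv_name="progress.csv", json_name="params.json"):
    """Collect one record per run folder found under root.

    The default file names are assumptions; override them if yours differ.
    """
    runs = []
    for csv_path in sorted(Path(root).rglob(csv_name)):
        config_path = csv_path.parent / json_name
        # A run folder without a config file gets an empty config dict.
        config = json.loads(config_path.read_text()) if config_path.exists() else {}
        with csv_path.open(newline="") as f:
            rows = list(csv.DictReader(f))  # one dict per logged row, values as strings
        runs.append({"dir": str(csv_path.parent), "config": config, "rows": rows})
    return runs
```

Runs launched with the same hyperparameters can then be grouped by comparing their `config` dicts with the run_ID entry removed.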

Dec 22 '19 17:12 astooke

@tarungog Hi! Curious if you pursued anything for variable-sized observations and actions?

Mar 02 '20 22:03 astooke
