Nikhil Barhate

Results: 2 comments by Nikhil Barhate

First, this repository does NOT use Generalized Advantage Estimation; it uses a Monte Carlo estimate to compute `rewards_to_go` (the `reward` variable in the code), and `advantages` = `rewards_to_go` - `V(s_t)`. The only time we...
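
The computation described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `rewards_to_go` and `advantages`, the `dones` flags, and the default `gamma` are assumptions made for the example.

```python
def rewards_to_go(rewards, dones, gamma=0.99):
    """Discounted Monte Carlo returns, computed backwards over a rollout.

    `dones[t]` is True when step t is the last step of an episode, so the
    running return is reset there and does not leak across episodes.
    """
    returns = []
    running = 0.0
    for r, done in zip(reversed(rewards), reversed(dones)):
        if done:
            running = 0.0  # episode boundary: start a fresh return
        running = r + gamma * running
        returns.append(running)
    returns.reverse()  # restore chronological order
    return returns


def advantages(returns, values):
    """Advantage estimate A_t = rewards_to_go_t - V(s_t); no GAE involved."""
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    rtg = rewards_to_go([1.0, 1.0, 1.0], [False, False, True], gamma=0.5)
    print(rtg)
    print(advantages(rtg, [1.0, 1.0, 1.0]))
```

Unlike GAE, this estimate has no `lambda` parameter and does not bootstrap from the value function; the critic's `V(s_t)` only serves as a baseline subtracted from the sampled return.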

The Decision Transformer paper does not provide results for the pointmaze environment. It is a difficult env, and I would not expect DT to work well on it out of the...