Nikhil Barhate comments


Results: 2 comments by Nikhil Barhate

Discounted Reward Calculation (Generalized Advantage Estimation)

First, this repository does NOT use Generalized Advantage Estimation. It uses a Monte Carlo estimate to calculate `rewards_to_go` (the `reward` variable in the code), and computes `advantages` = `rewards_to_go` - `V(s_t)`. The only time we...
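To make the distinction concrete, here is a minimal sketch of a Monte Carlo rewards-to-go calculation followed by the advantage subtraction described above. It is not the repository's code: the function names, the `is_terminals` flags used to reset the running return at episode boundaries, and the default `gamma` are all assumptions made for illustration.

```python
def discounted_rewards_to_go(rewards, is_terminals, gamma=0.99):
    """Monte Carlo estimate of the discounted return from each timestep.

    Hypothetical helper: `is_terminals[t]` is assumed to be True when
    step t is the last step of an episode, so returns do not leak
    across episode boundaries.
    """
    returns = []
    running = 0.0
    # Walk the trajectory backwards so each return reuses the next one.
    for reward, done in zip(reversed(rewards), reversed(is_terminals)):
        if done:
            running = 0.0  # nothing follows the last step of an episode
        running = reward + gamma * running
        returns.insert(0, running)
    return returns


def monte_carlo_advantages(rewards_to_go, state_values):
    """advantages = rewards_to_go - V(s_t), element-wise (no GAE lambda term)."""
    return [g - v for g, v in zip(rewards_to_go, state_values)]


# Example: one three-step episode with reward 1 per step and gamma = 0.5.
example_returns = discounted_rewards_to_go(
    [1.0, 1.0, 1.0], [False, False, True], gamma=0.5
)
example_advantages = monte_carlo_advantages(example_returns, [1.0, 1.0, 1.0])
```

Unlike Generalized Advantage Estimation, this sketch never bootstraps from `V(s_{t+1})` and has no `lambda` parameter: the value estimate enters only as a baseline subtracted from the full Monte Carlo return.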

Training is ok, but failed to eval.

The Decision Transformer paper does not provide results for the pointmaze environment. It is a difficult environment, and I would not expect DT to work well on it out of the...