imitation
Rewards visualization
Problem
Compare the Student and the Expert, or the rewards space before and after improvements from demonstrations.
Solution
Add a visualization that shows the improvement in rewards, possibly a heat-map of rewards before and after applying Imitation Learning.
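To make the heat-map idea concrete, here is a minimal sketch of what it could look like. It is not tied to any existing API in this project: the function names `reward_grid` and `plot_before_after`, the `(x, y, reward)` sample format, and the assumption that rewards can be indexed by a 2-D state are all hypothetical. Plotting uses matplotlib only if it is installed.

```python
def reward_grid(samples, bins=10, lo=0.0, hi=1.0):
    """Average rewards into a bins x bins grid.

    samples: iterable of (x, y, reward) tuples (hypothetical format, assuming
    a 2-D state). Cells with no samples are None.
    """
    totals = [[0.0] * bins for _ in range(bins)]
    counts = [[0] * bins for _ in range(bins)]
    width = (hi - lo) / bins
    for x, y, reward in samples:
        if not (lo <= x <= hi and lo <= y <= hi):
            continue  # ignore samples outside the plotted range
        col = min(int((x - lo) / width), bins - 1)
        row = min(int((y - lo) / width), bins - 1)
        totals[row][col] += reward
        counts[row][col] += 1
    return [
        [totals[r][c] / counts[r][c] if counts[r][c] else None for c in range(bins)]
        for r in range(bins)
    ]


def plot_before_after(before, after, path="rewards.png"):
    """Save two side-by-side heat-maps; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, grid, title in zip(axes, (before, after), ("Before", "After")):
        # Empty cells become NaN, which imshow leaves blank.
        data = [[float("nan") if v is None else v for v in row] for row in grid]
        image = ax.imshow(data, origin="lower")
        ax.set_title(title)
        fig.colorbar(image, ax=ax)
    fig.savefig(path)
    plt.close(fig)
```

Calling `reward_grid` once on rewards collected before Imitation Learning and once on rewards collected after, then passing both grids to `plot_before_after`, would give the side-by-side comparison described above.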
Possible alternative solutions
A heat-map looks most feasible. An alternative would be to provide some form of rewards function as an output: an approximated polynomial (say, up to the 5th power) that developers could then visualize as they find appropriate.
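As a rough illustration of the polynomial alternative, the sketch below fits a least-squares polynomial to sampled rewards and returns its coefficients. It assumes rewards are sampled along a single scalar variable, which may not match the actual rewards space, and `fit_polynomial` and `evaluate` are hypothetical names, not existing functions.

```python
def fit_polynomial(xs, ys, degree=5):
    """Least-squares polynomial fit.

    Returns coefficients c[0..degree], lowest power first.
    Needs at least degree + 1 distinct x values.
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

The returned coefficient list is the "rewards function as an output": developers could plot `evaluate(coeffs, x)` over any range with whatever tool they prefer.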