`TDLearner` time step parameter
`TDLearner(;approximator, γ=1.0, method, n=0)`: the `n` in the constructor is, strangely, not the number of time steps used but that number minus 1.
;(
I struggled with it too...
In the end, I decided to follow the TD(λ) convention with λ = 0. So maybe it would be better to rename the keyword argument?
Isn't TD(λ) defined separately in `TDλReturnLearner`?
The `n` here can just be the n of n-step TD methods, no? It's simple enough to change, but I'm not sure how one would introduce a breaking change. (Funnily enough, the example in RLAnIntroduction.jl took `n` to be the number of time steps, lol.)
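
For reference, here is a rough sketch of the two conventions, assuming the standard n-step return from Sutton and Barto. The symbol $n_{\text{kw}}$ for the constructor keyword is just my own notation, not a name from the package:

$$
G_{t:t+n} = R_{t+1} + \gamma R_{t+2} + \dots + \gamma^{n-1} R_{t+n} + \gamma^{n} V(S_{t+n})
$$

If the keyword really is the number of steps minus 1, as described above, then $n = n_{\text{kw}} + 1$, so the default $n_{\text{kw}} = 0$ would correspond to the one-step target:

$$
G_{t:t+1} = R_{t+1} + \gamma V(S_{t+1})
$$

Under the n-step naming, that same one-step target would be written with `n=1` instead.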
Hmm, let me examine it again when adding it back in the next release.