droiter
Results: 2 issues of droiter
2018-06-22: the share count of 511280 中期信用ETF (Medium-Term Credit ETF) is 0
I think td_error in AC is the same as the advantage in the baseline solution: both are the reward minus the predicted value. One difference is that the AC value network learns by TD, baseline...
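To make the comparison concrete, here is a minimal sketch of the two quantities under textbook definitions. It is not taken from the repository the issue refers to: the function names, the discount factor, and the sample numbers are all hypothetical. It assumes the one-step TD error is `r + gamma * V(s') - V(s)` and that the baseline advantage is the Monte Carlo return minus `V(s)`. Both subtract the predicted value, but the TD error bootstraps from the next state's estimate while the baseline version waits for the full episode return.

```python
def td_error(reward, value_s, value_next, gamma=0.9, done=False):
    """One-step TD error: r + gamma * V(s') - V(s) (assumed textbook form)."""
    # On a terminal transition there is no next state to bootstrap from.
    target = reward if done else reward + gamma * value_next
    return target - value_s


def discounted_returns(rewards, gamma=0.9):
    """Monte Carlo returns G_t for one episode, accumulated backwards."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]


def baseline_advantages(rewards, values, gamma=0.9):
    """Advantage with a value baseline: G_t - V(s_t) for each step."""
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]   # hypothetical episode
    values = [0.5, 0.5, 0.5]    # hypothetical V(s_t) predictions
    print(td_error(rewards[0], values[0], values[1], gamma=0.5))
    print(baseline_advantages(rewards, values, gamma=0.5))
```

When the value estimates are exact the two agree in expectation. In practice the TD form usually trades lower variance for some bias from the bootstrapped estimate.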