on-policy
Confused about huber_loss
The function huber_loss in utils is defined as:
def huber_loss(e, d):
    a = (abs(e) <= d).float()
    b = (e > d).float()
    return a*e**2/2 + b*d*(abs(e)-d/2)
This returns a zero loss when the error is negative and its magnitude is greater than huber_delta (e < -d), because neither mask a nor mask b is set in that case.
If I'm not mistaken, it should be:
    b = (abs(e) > d).float()
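To illustrate, here is a minimal scalar sketch in plain Python. It is not the repository's code: the names huber_loss_original and huber_loss_fixed are hypothetical, and float(bool) stands in for the tensor call (mask).float(), on the assumption that the masks behave the same way elementwise.

```python
def huber_loss_original(e, d):
    # Scalar stand-in for the quoted function; float(bool) mimics (mask).float()
    a = float(abs(e) <= d)   # quadratic region
    b = float(e > d)         # one-sided mask, as in the quoted code
    return a * e**2 / 2 + b * d * (abs(e) - d / 2)


def huber_loss_fixed(e, d):
    # Same as above, but with the proposed two-sided mask
    a = float(abs(e) <= d)
    b = float(abs(e) > d)    # covers both e > d and e < -d
    return a * e**2 / 2 + b * d * (abs(e) - d / 2)


d = 10.0
for e in (5.0, 15.0, -15.0):
    # Prints the error, then the original and fixed loss values side by side
    print(e, huber_loss_original(e, d), huber_loss_fixed(e, d))
```

With d = 10, the two versions agree for e = 5 and e = 15, but for e = -15 the original version gives 0 while the version with the two-sided mask gives the same loss as for e = 15.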
Looking forward to hearing from you.