photon-ml icon indicating copy to clipboard operation
photon-ml copied to clipboard

Add normalization to HessianDiagonalAggregator

Open ashelkovnykov opened this issue 8 years ago • 6 comments

From a comment by @XianXing on PR #131:

"The computation and approximation for Hessian diagonal was designed and implemented independent of Photon's normalization, because previously we didn't enable normalization for GLMix.

But it will be helpful if we take the normalization into account as well here for Hessian diagonal, in order to make it consistent with the rest of code."

ashelkovnykov avatar Oct 19 '16 18:10 ashelkovnykov

Assigning to myself as I am working on normalization right now.

@XianXing - can you review this PR, and let me know if it is actually incorrect from an algorithm point of view, if we don't normalize the Hessian diagonal when normalization is turned on?

fastier-li avatar Mar 16 '17 14:03 fastier-li

The PR looks good to me in general, but I haven't got a chance to look at it carefully.

Can you have other folks to take a look at the PR as well, for example @ankans or Ajith, so that you guys get the help needed and won't get blocked?

XianXing avatar Mar 16 '17 15:03 XianXing

Yes, we are having it reviewed by Ankan too. What it does is only setup normalization contexts in GAME and then call the same machinery as photon. If it was correct in photon after the normalization contexts were set, it should be correct in GAME, that hasn't changed at all. What happens to the Hessian diagonal in photon, in case of normalization?

fastier-li avatar Mar 16 '17 16:03 fastier-li

@XianXing - what is the real impact of not having this? Is it correctness? Is it performance? Is it required to compute variances correctly?

fastier-li avatar Mar 16 '17 17:03 fastier-li

Correctness. The impact would be downstream applications. For example, I don't know if we are using the estimated variance for multi-armed bandit and explore-exploit tradeoffs at LNKD or not, but based on our experience the quality of estimated variance would have direct impact on those applications (can't be more specific than this without getting into trouble :) ).

Having said that, if there are no such use cases at LNKD, please feel free to de-prioritize it.

@pogil @beechung

XianXing avatar Mar 16 '17 18:03 XianXing

So, it is correctness of variances, only. The correctness of coefficient means themselves does not depend on that, right?

fastier-li avatar Mar 16 '17 18:03 fastier-li