glum
glum copied to clipboard
Add `diag_fisher` functionality to allow the Hessian to be computed "on the fly"
Instead of storing the entire Hessian matrix, diag_fisher=True calculates Hessian rows only when they must be used (i.e. in the relevant step of coordinate descent). It accomplishes this by storing the current state of the 1-D array hessian_rows (which appears in the middle of the sandwich product). This necessitates a greater number of matrix operations (many of them redundant), but also saves memory.
To-Do:
state.hessiandoes not need to be instantiated ifdiag_fisher=True; this is not yet accounted for in the current code.- Make sure we've addressed all entry points where the argument
diag_fishercan be set by the user. - How do we implement memory views (and possibly
nogil) in_cd_fast.pyx?