daniel servén
Also, since lambda is symmetric, consider storing only the lower triangular part.
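A minimal sketch of the symmetric-storage idea: keep only the lower triangle of lambda and rebuild the full matrix on demand. The variable names here are illustrative, not from the codebase.

```python
import numpy as np

# lambda is symmetric, so the lower triangle carries all the information.
lam = np.array([[1.0, 0.2, 0.0],
                [0.2, 2.0, 0.5],
                [0.0, 0.5, 3.0]])

lower = np.tril(lam)  # stored representation (upper triangle zeroed)

# reconstruct the full symmetric matrix: mirror the strict lower
# triangle and keep the diagonal once
full = lower + lower.T - np.diag(np.diag(lower))
```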
made a change to the diagonals... haven't run into the undefined-loss problem again, yet...
inf loss is happening when we use very large matrices: the determinant overflows to inf, so log(det) in the loss is undefined. use `np.linalg.slogdet` instead of `np.linalg.det`
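To illustrate the overflow: for a large positive-definite matrix, the determinant is a product of hundreds of eigenvalues and blows past float range, while `np.linalg.slogdet` returns the sign and log-determinant directly and stays finite. The matrix below is synthetic, just to trigger the failure.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
# symmetric positive definite, eigenvalues >= 500
theta = A @ A.T + 500 * np.eye(500)

# det is roughly a product of 500 eigenvalues each >= 500,
# which overflows float64 (~1.8e308) to inf
det = np.linalg.det(theta)

# slogdet sums logs instead of multiplying, so it stays finite
sign, logdet = np.linalg.slogdet(theta)
```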
the loss should NEVER increase due to a Theta update because the objective is convex and we have a closed-form solution for the update.
i like some of the ideas in this example. visualizing the relationships in the stock market http://scikit-learn.org/stable/auto_examples/applications/plot_stock_market.html#sphx-glr-auto-examples-applications-plot-stock-market-py
Is this computation necessary? can we leverage a previous matrix product?
there are lots of implicit Gamma computations in the check-descent loop. these should be cached if possible!
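One possible shape for that caching: recompute Gamma only when Theta actually changes, and hand back the cached value inside the check-descent loop. This is a hypothetical sketch; `GammaCache` and the `compute` callback are illustrative names, not existing code.

```python
import numpy as np

class GammaCache:
    """Recompute a derived quantity only when the input array changes."""

    def __init__(self):
        self._key = None
        self._value = None

    def get(self, theta, compute):
        # hash the array contents; a new Theta invalidates the cache
        key = theta.tobytes()
        if key != self._key:
            self._key = key
            self._value = compute(theta)
        return self._value
```

Inside the loop, repeated calls with the same Theta then cost one lookup instead of one full Gamma computation.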
FIX THIS!! the coord descent loop is so wasteful >:0 !!!
look up the stopping criterion that uses norm of gradient and norm of loss!
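One common form of that criterion: stop when the gradient norm is small relative to the magnitude of the loss. A hedged sketch, with illustrative names and a tolerance chosen arbitrarily:

```python
import numpy as np

def converged(grad, loss, tol=1e-6):
    """Stop when ||grad|| is small relative to |loss| (floored at 1
    so a near-zero loss doesn't make the test impossible to pass)."""
    return np.linalg.norm(grad) <= tol * max(abs(loss), 1.0)
```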
sometimes, the algorithm will propose a step direction and then spend a long time backtracking until the learning rate hits machine epsilon. perhaps instead of taking a really small step, we should reject the direction and bail out early...
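A sketch of that idea: backtracking line search with an explicit floor, so a direction that never satisfies the sufficient-decrease condition is rejected instead of producing a machine-epsilon step. This uses the standard Armijo condition; the function and parameter names are illustrative, not from the codebase.

```python
import numpy as np

def backtrack(f, x, direction, grad, step=1.0, beta=0.5, c=1e-4):
    """Halve the step until the Armijo condition holds.

    Returns the accepted step size, or None if the step shrinks
    below machine epsilon (i.e. the direction should be rejected).
    """
    fx = f(x)
    # Armijo: f(x + s*d) <= f(x) + c * s * grad.dot(d)
    while f(x + step * direction) > fx + c * step * (grad @ direction):
        step *= beta
        if step < np.finfo(float).eps:
            return None  # give up on this direction instead of a tiny step
    return step
```

A caller can treat `None` as a signal to terminate or recompute the direction rather than burning iterations on a useless step.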