
Improve the performance of `glm.LogisticRegression`

Open · Fernadoo opened this issue on Oct 9, 2021 · 0 comments

The current implementation can be found in #2466. Iteratively calling stochastic gradient descent is quite inefficient for distributed frameworks like Mars: each SGD iteration does only a small amount of work, yet still incurs a full round of task scheduling and cross-worker communication.

Potential solutions could be (a rough sketch of the Newton-style approach in (1) follows the list):

  1. Zhuang, Yong, et al. "Distributed newton methods for regularized logistic regression." Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Cham, 2015.
  2. Gopal, Siddharth, and Yiming Yang. "Distributed training of large-scale logistic models." International Conference on Machine Learning. PMLR, 2013.
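Roughly, the distributed Newton approach in (1) replaces many small SGD updates with a few full-batch second-order iterations: each worker computes a partial gradient and partial Hessian over its own data chunk, and the driver sums the contributions and solves one linear system per iteration. A minimal single-machine sketch of that aggregation pattern, assuming L2-regularized logistic regression with 0/1 labels and a dense `np.linalg.solve` (the function names, chunking, and `lam` value below are illustrative assumptions, not Mars APIs), could look like:

```python
# Sketch only: simulates "one worker per chunk" with a plain Python loop.
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def partial_grad_hess(X_chunk, y_chunk, w):
    """Per-chunk contribution; in Mars this would run on one worker per chunk."""
    p = sigmoid(X_chunk @ w)
    grad = X_chunk.T @ (p - y_chunk)             # labels y in {0, 1}
    D = p * (1.0 - p)                            # diagonal of the logistic Hessian
    hess = (X_chunk * D[:, None]).T @ X_chunk
    return grad, hess


def newton_step(chunks, w, lam=1.0):
    """One Newton iteration: aggregate partial results, then solve once."""
    n_features = w.shape[0]
    grad = lam * w                               # L2 regularization added once, on the driver
    hess = lam * np.eye(n_features)
    for X_chunk, y_chunk in chunks:              # map over chunks, reduce by summation
        g, h = partial_grad_hess(X_chunk, y_chunk, w)
        grad += g
        hess += h
    return w - np.linalg.solve(hess, grad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    true_w = rng.normal(size=5)
    y = (sigmoid(X @ true_w) > rng.random(1000)).astype(float)
    chunks = [(X[i:i + 250], y[i:i + 250]) for i in range(0, 1000, 250)]

    w = np.zeros(5)
    for _ in range(10):                          # a handful of Newton iterations usually suffices
        w = newton_step(chunks, w)
    print(w)
```

Each iteration is then a single map-reduce over the chunks, so the scheduling overhead is paid a handful of times rather than once per mini-batch.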

Existing implementations of such optimization algorithms that could be referred to (a hedged usage sketch follows the link):

  • https://github.com/dask/dask-glm/blob/main/dask_glm/algorithms.py
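For illustration, the dask-glm solvers in that file operate directly on chunked arrays, which is analogous to Mars tensors. A hedged usage sketch is below; the synthetic data, chunk sizes, and keyword defaults are assumptions, so check algorithms.py for the actual signatures:

```python
import dask.array as da
from dask_glm.algorithms import newton
from dask_glm.families import Logistic

# Synthetic chunked design matrix and 0/1 labels.
X = da.random.random((100_000, 20), chunks=(10_000, 20))
p = 1.0 / (1.0 + da.exp(-(X @ da.random.random(20))))
y = (da.random.random(100_000, chunks=10_000) < p).astype(float)

# Full-batch Newton iterations over the chunked data; each iteration is one
# pass over all chunks rather than many small SGD steps.
beta = newton(X, y, family=Logistic)
```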
