stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

linalg: least squares

Open perazz opened this issue 2 months ago • 4 comments

Least squares solution to to system $Ax=b$, i.e. such that the 2-norm $||b-Ax||$ is minimized., based on LAPACK *GELSD functions.

  • [x] base implementation
  • [x] tests
  • [x] exclude unsupported xdp
  • [x] documentation
  • [x] submodule
  • [x] examples
  • [x] subroutine interface

Prior art:

  • NumPy: lstsq(a, b, rcond='warn')
  • Scipy: lstsq(a, b, cond=None, overwrite_a=False, overwrite_b=False, check_finite=True, lapack_driver=None)
  • IMSL: Result = IMSL_QRSOL(B, [A] [, AUXQR] [, BASIS] [, /DOUBLE] [, QR] [, PIVOT] [, RESIDUAL] [, TOLERANCE])

Proposed implementation: generic interface for either 1 RHS or many RHS's

  • x = lstsq(a,b) -> simplest call
  • x = lstsq(a,b,cond,overwrite_a,rank,err) -> expert call:
    • real(kind(a)), intent(in), optional :: cond = optional cutoff for rank evaluation: singular values s(i)<=cond*maxval(s) are considered 0. Default: 0.0
    • logical(lk), optional, intent(in) :: overwrite_a = option to avoid internal allocation, but destroy input matrix a. Default: .false.
    • integer(ilp), optional, intent(out) :: rank = return rank of A, if requested
    • type(linalg_state_type), optional, intent(out) :: err = return state variable

cc: @jvdp1 @jalvesz @fortran-lang/stdlib I believe this is ready for consideration.

perazz avatar Apr 21 '24 13:04 perazz

Thank you all @jvdp1 @gnikit @jalvesz for the comments! I believe it's polished enough now. Regarding the allocation question @jalvesz, please let me know. Because there are 3/4 allocations per case, the subroutine interface would look similar to gelsd: something like call lstsq_solve(A,b,x,rwork,iwork,cwork,rank,err) so if one wants to handle the solution fully controlling it, maybe (thinking aloud) they could just use the gelsd LAPACK interface that we also provide? Otherwise, I'd say let's open a separate issue for this, so we can discuss further and find a suitable common ground

perazz avatar Apr 26 '24 06:04 perazz

Regarding the allocation question @jalvesz, please let me know. Because there are 3/4 allocations per case, the subroutine interface would look similar to gelsd: something like call lstsq_solve(A,b,x,rwork,iwork,cwork,rank,err) so if one wants to handle the solution fully controlling it, maybe (thinking aloud) they could just use the gelsd LAPACK interface that we also provide?

Indeed, full control on that regard can be done using the gelsd interface directly.

I'd say let's open a separate issue for this, so we can discuss further and find a suitable common ground

Yes, let's keep this idea in mind for latter on.

jalvesz avatar Apr 26 '24 12:04 jalvesz

@jvdp1 @jalvesz @gnikit I added the subroutine interface. This interface is now far more complicated, so I would like to ask your help deciding if the expert user interface has now too many control knobs? We now also have to export an interface to let the user compute what's the minimum working array space for all cases.

Here is the new interfaces I've prepared:

https://github.com/fortran-lang/stdlib/blob/2e370f45438a637c90b608609f6edc0ddbdc6ef6/src/stdlib_linalg.fypp#L344-L351

https://github.com/fortran-lang/stdlib/blob/2e370f45438a637c90b608609f6edc0ddbdc6ef6/src/stdlib_linalg.fypp#L295-L321

perazz avatar Apr 29 '24 08:04 perazz

I'll try to take a close look at this soon @perazz, great work!! This makes me think about the issues when one wants to solve sparse linear systems with different preconditioners, one also needs to create a work space for the preconditioner. So I think it is worth defining this notion properly now.

jalvesz avatar Apr 29 '24 10:04 jalvesz

Thank you @perazz . It sounds ready to be merged IMO. Do you agree? @jalvesz @gnikit ?

jvdp1 avatar May 07 '24 06:05 jvdp1