
Add `Op` to solve batched linear matrix equation

Open purna135 opened this issue 2 years ago • 17 comments

Addressing #791

This PR adds an `Op` to `aesara.tensor.nlinalg` that wraps `numpy.linalg.solve` for solving linear matrix equations, or systems of linear scalar equations, and for computing the reverse-mode gradients.

Now, since there is already a `Solve` in `aesara.tensor.slinalg`, why do we need this NumPy-based solve as well?

SciPy doesn't appear to have anything that can solve a linear system with a batch of triangular matrices, but we can handle the batched data by wrapping the SciPy `Solve` with `numpy.vectorize`. NumPy, on the other hand, has `numpy.linalg.solve`, which handles batches of matrices natively, though it was expected to be slower.
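
For illustration, here is a minimal sketch (assumed setup, not the PR code) of the two approaches: NumPy's natively batched solve versus a `numpy.vectorize` wrapper around SciPy's 2-D-only solve.

```python
# Sketch: solve a batch of k systems A[i] @ X[i] = B[i] with NumPy's native
# broadcasting vs. a numpy.vectorize wrapper around scipy.linalg.solve.
import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
k, m, n = 32, 8, 3
A = rng.normal(size=(k, m, m)) + m * np.eye(m)  # well-conditioned batch
B = rng.normal(size=(k, m, n))

# NumPy broadcasts over the leading batch dimension in compiled code.
X_numpy = np.linalg.solve(A, B)

# scipy.linalg.solve only accepts 2-D inputs, so loop over the batch at the
# Python level via numpy.vectorize with a gufunc-style signature.
batched_scipy_solve = np.vectorize(
    scipy.linalg.solve, signature="(m,m),(m,n)->(m,n)"
)
X_scipy = batched_scipy_solve(A, B)

assert np.allclose(X_numpy, X_scipy)
```

Both produce the same result; the difference is whether the loop over the batch runs in compiled code or in Python.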

So, to decide which one to use, I created a vectorized version of `aesara.tensor.slinalg.cholesky` and compared its performance to another Cholesky implementation that handles the batched data using `numpy.linalg.solve`.

The detailed investigation can be found at the link below.

  1. Cholesky - Numpy vs Scipy for Large Array
  2. Cholesky - Numpy vs Scipy for Small Array

In short, I obtained the following result:

  1. Runtime of Numpy vs Scipy for Large Array (plot)

  2. Runtime of Numpy vs Scipy for Small Array (plot)

As we can see, the Cholesky that uses NumPy's solve is more performant for reasonably sized arrays and has simpler code, so we decided to use `numpy.linalg.solve` instead of the SciPy solve.
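
For reference, the kind of timing comparison described above can be sketched as follows (assumed setup, not the linked notebooks), pitting NumPy's natively batched Cholesky against a `numpy.vectorize` wrapper around SciPy's:

```python
# Rough timing sketch: numpy.linalg.cholesky on a batch of SPD matrices vs.
# a numpy.vectorize wrapper around scipy.linalg.cholesky.
import timeit

import numpy as np
import scipy.linalg

rng = np.random.default_rng(42)
k, m = 64, 50
A = rng.normal(size=(k, m, m))
SPD = A @ np.swapaxes(A, -1, -2) + m * np.eye(m)  # batch of SPD matrices

# scipy.linalg.cholesky returns the upper factor by default; request the
# lower factor to match numpy.linalg.cholesky.
vectorized_scipy_cholesky = np.vectorize(
    lambda a: scipy.linalg.cholesky(a, lower=True), signature="(m,m)->(m,m)"
)

t_scipy = timeit.timeit(lambda: vectorized_scipy_cholesky(SPD), number=50)
t_numpy = timeit.timeit(lambda: np.linalg.cholesky(SPD), number=50)
print(f"vectorized SciPy: {t_scipy:.4f}s, NumPy: {t_numpy:.4f}s")
```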

purna135 avatar Jul 18 '22 16:07 purna135

Codecov Report

Merging #1060 (e95fa12) into main (8794f48) will increase coverage by 0.01%. The diff coverage is 100.00%.

:exclamation: Current head e95fa12 differs from pull request most recent head 79e45be. Consider uploading reports for the commit 79e45be to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1060      +/-   ##
==========================================
+ Coverage   79.25%   79.27%   +0.01%     
==========================================
  Files         152      152              
  Lines       47875    47921      +46     
  Branches    10906    10911       +5     
==========================================
+ Hits        37942    37988      +46     
  Misses       7436     7436              
  Partials     2497     2497              
| Impacted Files | Coverage Δ |
|---|---|
| aesara/tensor/nlinalg.py | 98.74% <100.00%> (+0.16%) :arrow_up: |

codecov[bot] avatar Jul 18 '22 18:07 codecov[bot]

Hello, @brandonwillard. Please accept my apologies for forgetting to link my analysis on this; I have now updated the description of this PR with my findings on both solve methods. Please take a look at them.

purna135 avatar Jul 31 '22 23:07 purna135

> Hello, @brandonwillard. Please accept my apologies for forgetting to link my analysis on this; I have now updated the description of this PR with my findings on both solve methods. Please take a look at them.

The questions I'm referring to have to do with how NumPy's solve is added. There's enough inherent overlap between the existing Solve and this one to prioritize code reuse, generalization, etc.

brandonwillard avatar Jul 31 '22 23:07 brandonwillard

Can we completely replace the existing `Solve` with this NumPy solve?

purna135 avatar Aug 01 '22 00:08 purna135

> Can we completely replace the existing `Solve` with this NumPy solve?

I don't think so; I recall each as having distinct options/approaches.

brandonwillard avatar Aug 01 '22 00:08 brandonwillard

> Hello, @brandonwillard. Please accept my apologies for forgetting to link my analysis on this; I have now updated the description of this PR with my findings on both solve methods. Please take a look at them.

@purna135 did you benchmark cholesky or solve? I remember you analysing only cholesky, and that's what your linked notebook does.

It seems, though, that SciPy's cholesky also offers one argument that NumPy's does not (`lower`); not sure how important it is to keep that one around. Anyway, that's not the focus of this PR. Edit: the defaults of SciPy and NumPy are not even the same :/

ricardoV94 avatar Aug 01 '22 06:08 ricardoV94

For solve, since scipy has many more options than numpy, I would think that the vectorize approach is more straightforward.

ricardoV94 avatar Aug 01 '22 06:08 ricardoV94
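
As an illustration of these extra options (a standalone example, not from the thread): `scipy.linalg.solve` accepts structure hints such as `assume_a` and `lower`, which `numpy.linalg.solve` does not expose.

```python
# scipy.linalg.solve can exploit known matrix structure via assume_a (and
# `lower`, i.e. which triangle of a symmetric matrix to read); the result
# matches NumPy's general LU-based solve.
import numpy as np
import scipy.linalg

rng = np.random.default_rng(1)
m = 5
A = rng.normal(size=(m, m)) + m * np.eye(m)
A_sym = A + A.T                      # symmetric, well-conditioned
b = rng.normal(size=(m, 1))

x_scipy = scipy.linalg.solve(A_sym, b, assume_a="sym")  # symmetric solver path
x_numpy = np.linalg.solve(A_sym, b)                     # general solver

assert np.allclose(x_scipy, x_numpy)
```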

> @purna135 did you benchmark cholesky or solve?

Yes, I only tested cholesky, but since cholesky is dependent on solve, I thought it would be better to work on solve in a separate PR.

purna135 avatar Aug 01 '22 07:08 purna135

> @purna135 did you benchmark cholesky or solve?

> Yes, I only tested cholesky, but since cholesky is dependent on solve, I thought it would be better to work on solve in a separate PR.

I agree; it's just that the linked notebook doesn't matter for this PR, because it doesn't say anything about `Solve`.

ricardoV94 avatar Aug 01 '22 08:08 ricardoV94

@purna135 can you do a similar analysis benchmarking the SciPy and NumPy solve `Op`s?

Sayam753 avatar Aug 02 '22 06:08 Sayam753

Yes, let me test the solve `Op`.

purna135 avatar Aug 02 '22 06:08 purna135

I think the benchmark doesn't matter as much here, because scipy has more options that have no numpy equivalent.

ricardoV94 avatar Aug 02 '22 08:08 ricardoV94

@purna135 let's revisit the SciPy solve `Op` and explore the option of adding broadcasting there.

Sayam753 avatar Aug 02 '22 09:08 Sayam753

> @purna135 let's revisit the SciPy solve `Op` and explore the option of adding broadcasting there.

Ok

purna135 avatar Aug 02 '22 09:08 purna135

Although the benchmark is no longer relevant, I tested both solve methods out of curiosity and found that the Numpy implementation of solve is 2-3 times faster than the vectorized version of Scipy solve.

The detailed analysis can be found here: https://github.com/purna135/GSoC-Learnings/blob/main/Performance%20Comparison%20for%20Solve%20method%20-%20Numpy%20vs%20Scipy.ipynb

(plot: runtime of NumPy's solve vs. the vectorized SciPy solve, from the linked notebook)

purna135 avatar Aug 02 '22 12:08 purna135

> Although the benchmark is no longer relevant, I tested both solve methods out of curiosity and found that the Numpy implementation of solve is 2-3 times faster than the vectorized version of Scipy solve.

A `numpy.vectorize`d version will almost necessarily be slower than a ufunc or C-level vectorized implementation (e.g. because the former is just a Python for-loop with additional bookkeeping), so that's to be expected.

Aside from that, it seems like those comparisons are only demonstrating the difference between the implementations and/or defaults of numpy.linalg.solve and scipy.linalg.solve for sparse/broadcasted identity matrices. Does that sound correct?

brandonwillard avatar Aug 04 '22 19:08 brandonwillard
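
To illustrate the overhead mentioned above (a standalone example, not from the thread): `numpy.vectorize` calls the wrapped Python function once per element of the broadcasted loop, so it behaves like a Python `for` loop rather than compiled code.

```python
# numpy.vectorize is essentially a Python-level loop with broadcasting
# bookkeeping, so it is much slower than a natively vectorized operation.
import timeit

import numpy as np

x = np.arange(1_000_000, dtype=np.float64)

looped_square = np.vectorize(lambda v: v * v)  # one Python call per element

t_loop = timeit.timeit(lambda: looped_square(x), number=10)
t_ufunc = timeit.timeit(lambda: x * x, number=10)
print(f"np.vectorize: {t_loop:.3f}s, native ufunc: {t_ufunc:.3f}s")
```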

> Aside from that, it seems like those comparisons are only demonstrating the difference between the implementations and/or defaults of numpy.linalg.solve and scipy.linalg.solve for sparse/broadcasted identity matrices. Does that sound correct?

Yes, exactly

purna135 avatar Aug 05 '22 12:08 purna135