matrix icon indicating copy to clipboard operation
matrix copied to clipboard

feat: optimize SVD ?

Open lpatiny opened this issue 1 year ago • 1 comments

The order of processing loops may be critical in order to optimize speed.

The implemented algorithms EVD, SVD, etc may assume a specific row / col or col / row order. The goal being that in the inner loops the memory is accessed sequentially.

By simply inverting the get / set in matrix (just using square matrices not to check everything) we observe major time difference (in ms)

The best improvement was observe for a SVD of 1000x1000 elements (x3 !) but the inversion can also lead to worse results like for LuDecomposition (/2 ...).

Inverted

┌─────────┬──────┬────────────────────────────┬──────────────────────┬─────────────────────────┐
│ (index) │ size │ SingularValueDecomposition │   LuDecomposition    │ EigenvalueDecomposition │
├─────────┼──────┼────────────────────────────┼──────────────────────┼─────────────────────────┤
│    0    │  10  │    0.047006314899820094    │ 0.006285565016191445 │   0.04497034566020066   │
│    1    │ 100  │     9.556330035184766      │  0.5338207964597829  │    11.88394686695906    │
│    2    │ 500  │     964.0846458325783      │  57.523048850654185  │   1427.5238334983587    │
│    3    │ 1000 │     7546.173541992903      │  563.8283982227246   │   15070.775082990527    │
│    4    │ 1500 │     25270.795874997973     │  2161.1955276678004  │    39320.5441249907     │
└─────────┴──────┴────────────────────────────┴──────────────────────┴─────────────────────────┘

Non inverted

┌─────────┬──────┬────────────────────────────┬───────────────────────┬─────────────────────────┐
│ (index) │ size │ SingularValueDecomposition │    LuDecomposition    │ EigenvalueDecomposition │
├─────────┼──────┼────────────────────────────┼───────────────────────┼─────────────────────────┤
│    0    │  10  │    0.028381821972091822    │ 0.0047396303146609135 │  0.029047660718630176   │
│    1    │ 100  │     7.579296062081497      │  0.39713138298076744  │   11.749581896539393    │
│    2    │ 500  │     1620.0299582481384     │  38.316282882820815   │   1338.5159477517009    │
│    3    │ 1000 │     22116.20337499678      │   307.1329264702166   │   19482.321583002806    │
│    4    │ 1500 │     44339.38079200685      │  1027.2492165982724   │    41522.77608400583    │
└─────────┴──────┴────────────────────────────┴───────────────────────┴─────────────────────────┘

This is just at the stage of reflection that could lead to some speed improvement.

lpatiny avatar Sep 29 '23 13:09 lpatiny