Call *rot to perform eigenvector update of *steqr
Call *rot instead of *lasr to perform the eigenvector update in *steqr, so that the update makes full use of BLAS subroutines.
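As a rough illustration of why the two approaches are interchangeable, the sketch below applies the same sequence of plane rotations to the columns of a matrix in two ways: with a loop nest modelled on the `SIDE='R'`, `PIVOT='V'`, `DIRECT='F'` branch of *lasr, and with one *rot-style call per adjacent column pair. It is plain Python rather than the Fortran in this PR; the function names (`rot`, `lasr_rvf`, `apply_with_rot`, `max_difference`) are hypothetical, and the choice of *lasr variant is an assumption made for the example.

```python
import math
import random


def rot(x, y, c, s):
    """BLAS *rot semantics, in place: x <- c*x + s*y, y <- c*y - s*x."""
    for i in range(len(x)):
        temp = c * x[i] + s * y[i]
        y[i] = c * y[i] - s * x[i]
        x[i] = temp


def lasr_rvf(cols, c, s):
    """Loop nest modelled on *lasr with SIDE='R', PIVOT='V', DIRECT='F'.

    cols[j] is column j of the matrix (column-major, as in Fortran).
    """
    m = len(cols[0])
    for j in range(len(cols) - 1):
        ct, st = c[j], s[j]
        for i in range(m):
            temp = cols[j + 1][i]
            cols[j + 1][i] = ct * temp - st * cols[j][i]
            cols[j][i] = st * temp + ct * cols[j][i]


def apply_with_rot(cols, c, s):
    """Same update expressed as one rot call per pair of adjacent columns."""
    for j in range(len(cols) - 1):
        rot(cols[j], cols[j + 1], c[j], s[j])


def max_difference(m=6, n=5, seed=0):
    """Apply both variants to the same random data; return the largest entry gap."""
    rng = random.Random(seed)
    cols = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n)]
    angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n - 1)]
    c = [math.cos(t) for t in angles]
    s = [math.sin(t) for t in angles]
    a = [col[:] for col in cols]
    b = [col[:] for col in cols]
    lasr_rvf(a, c, s)
    apply_with_rot(b, c, s)
    return max(abs(a[j][i] - b[j][i]) for j in range(n) for i in range(m))
```

Calling `max_difference()` compares the two results entry by entry. Each rot call touches two whole columns, which is the contiguous access pattern an optimized BLAS can exploit.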
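Code equivalence is verified with the residual norm $\Vert{A-ZDZ^T}\Vert_2$, where $Z$ and $D$ hold the computed eigenvectors and eigenvalues of $A$. The matrix entries are generated randomly. The norms are identical before and after the PR.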
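For readers who want to reproduce this kind of check on a small case, here is a minimal sketch in plain Python. It builds a symmetric tridiagonal $A$ from its diagonal `d` and off-diagonal `e`, and measures $A - ZDZ^T$ for a given eigendecomposition. Note the assumptions: the helper names (`tridiag`, `residual_fro`) are hypothetical, and the sketch uses the Frobenius norm, which is an upper bound on the 2-norm reported in the tables, because it needs no SVD.

```python
import math


def tridiag(d, e):
    """Dense symmetric tridiagonal matrix with diagonal d and off-diagonal e."""
    n = len(d)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = d[i]
    for i in range(n - 1):
        a[i][i + 1] = e[i]
        a[i + 1][i] = e[i]
    return a


def residual_fro(d, e, w, z):
    """Frobenius norm of A - Z*diag(w)*Z^T.

    z[i][k] is component i of the eigenvector paired with eigenvalue w[k].
    """
    n = len(d)
    a = tridiag(d, e)
    total = 0.0
    for i in range(n):
        for j in range(n):
            zdzt = sum(z[i][k] * w[k] * z[j][k] for k in range(n))
            total += (a[i][j] - zdzt) ** 2
    return math.sqrt(total)


# 2x2 example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3
# with eigenvectors (1, -1)/sqrt(2) and (1, 1)/sqrt(2).
r = 1.0 / math.sqrt(2.0)
d_example = [2.0, 2.0]
e_example = [1.0]
w_example = [1.0, 3.0]
z_example = [[r, r], [-r, r]]
residual_example = residual_fro(d_example, e_example, w_example, z_example)
```

With an exact decomposition, `residual_example` sits at rounding-error level; perturbing an eigenvalue or eigenvector makes it grow, which is what makes the norm usable as an equivalence check.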
Single Precision Case
| Dimension | Norm before PR | Norm after PR |
|---|---|---|
| 1000 | 8.621327e-05 | 8.621327e-05 |
| 2000 | 1.890181e-04 | 1.890181e-04 |
Double Precision Case
| Dimension | Norm before PR | Norm after PR |
|---|---|---|
| 1000 | 2.297470e-13 | 2.297470e-13 |
| 2000 | 4.402642e-13 | 4.402642e-13 |
Performance is measured in milliseconds and shows a reduction in elapsed time of roughly 13 to 15 percent in the cases below. The platform is an Intel Xeon 1660 v3 @ 3GHz. As an additional test, we also measured the performance of the subroutine *kteqr from PR #1049.
Single Precision Case
| Dimension | Elapsed before PR (ms) | Elapsed after PR (ms) | Elapsed with skteqr (ms) |
|---|---|---|---|
| 1000 | 4281 | 3703 | 2359 |
| 2000 | 37063 | 31625 | 18781 |
Double Precision Case
| Dimension | Elapsed before PR (ms) | Elapsed after PR (ms) | Elapsed with dkteqr (ms) |
|---|---|---|---|
| 1000 | 6063 | 5234 | 2562 |
| 2000 | 47499 | 40406 | 22469 |