hypre
[Multivec 3/5]: Sequential SpMV updates
This is part of a series of PRs for enabling BoomerAMG to be applied to multivectors.
Changes in this PR:
- Extends hypre's SpMV on GPUs to work with multivectors.
- Improves hypre's SpMV on CPUs when using multivectors.
- Adds calls to cuSPARSE's SpMM.
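For readers unfamiliar with the term, a multivector SpMV applies the same sparse matrix to several vector components at once. The sketch below is a plain sequential CSR version, not hypre's implementation: the function name `csr_multivec_spmv_sketch`, its parameters, and the column-major layout (each component stored contiguously, separated by a stride) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of hypre's API.
 * Computes Y = alpha * A * X + beta * Y, where A is a CSR matrix with
 * num_rows rows and X, Y are multivectors with num_vectors components.
 * Assumed layout: component v of X starts at x + v * x_stride
 * (likewise for Y with y_stride). */
void csr_multivec_spmv_sketch(int num_rows, int num_vectors,
                              const int *row_ptr, const int *col_idx,
                              const double *values,
                              double alpha, const double *x, size_t x_stride,
                              double beta, double *y, size_t y_stride)
{
    assert(num_rows >= 0 && num_vectors >= 0);

    for (int i = 0; i < num_rows; i++) {
        /* Row i of A is traversed once per vector component. */
        for (int v = 0; v < num_vectors; v++) {
            double sum = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
                sum += values[k] * x[(size_t)v * x_stride + (size_t)col_idx[k]];
            }
            double *y_iv = &y[(size_t)v * y_stride + (size_t)i];
            *y_iv = alpha * sum + beta * (*y_iv);
        }
    }
}
```

With `num_vectors` set to 1 this reduces to an ordinary SpMV, which is the case the timings in this description measure.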
Regression tests:
- [x] Tux
- [x] Lassen
Times (in seconds) for 1000 SpMV (only one vector component) with 7-pt stencil Poisson operator:
| Nx | Ny | Nz | DOFs | Nonzeros | master | multivec | cuSPARSE 11.2.0 |
|---|---|---|---|---|---|---|---|
| 64 | 64 | 64 | 262,144 | 1,810,432 | 0.041974 | 0.044754 | 0.061116 |
| 96 | 96 | 96 | 884,736 | 6,137,856 | 0.125632 | 0.129209 | 0.165329 |
| 128 | 128 | 128 | 2,097,152 | 14,581,760 | 0.288029 | 0.286971 | 0.344246 |
| 160 | 160 | 160 | 4,096,000 | 28,518,400 | 0.552682 | 0.552286 | 0.645487 |
| 192 | 192 | 192 | 7,077,888 | 49,324,032 | 0.951319 | 0.946958 | 1.091524 |
Times (in seconds) for 1000 SpMV (only one vector component) with 27-pt stencil Poisson operator:
| Nx | Ny | Nz | DOFs | Nonzeros | master | multivec | cuSPARSE 11.2.0 |
|---|---|---|---|---|---|---|---|
| 64 | 64 | 64 | 262,144 | 6,859,000 | 0.120503 | 0.122454 | 0.135166 |
| 96 | 96 | 96 | 884,736 | 23,393,656 | 0.388851 | 0.389597 | 0.435317 |
| 128 | 128 | 128 | 2,097,152 | 55,742,968 | 0.983655 | 0.985481 | 0.9982 |
| 160 | 160 | 160 | 4,096,000 | 109,215,352 | 2.011068 | 2.021667 | 1.950247 |
| 192 | 192 | 192 | 7,077,888 | 189,119,224 | 3.502067 | 3.512579 | 3.392852 |
Legend:
- master: uses master branch
- multivec: this branch
- Platform: one V100 (Lassen)
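As a sanity check on the Nonzeros columns, the counts in both tables appear consistent with stencils on an N x N x N grid whose couplings across the domain boundary are dropped. The helper names below are hypothetical and the boundary treatment is an assumption inferred from the numbers, not something stated in this PR.

```c
#include <assert.h>

/* Hypothetical helpers for checking the tables, not hypre functions. */

/* 7-point stencil: every point has 7 entries, except that each of the
 * 6 faces of the cube loses one coupling per face point (n*n of them). */
long long nnz_7pt(long long n)
{
    assert(n >= 1);
    return 7 * n * n * n - 6 * n * n;
}

/* 27-point stencil: tensor product of three 1D 3-point stencils, each of
 * which has 3n - 2 nonzeros on n points. */
long long nnz_27pt(long long n)
{
    assert(n >= 1);
    long long m = 3 * n - 2;
    return m * m * m;
}
```

For example, `nnz_7pt(64)` gives 1,810,432 and `nnz_27pt(64)` gives 6,859,000, matching the first row of each table.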