Is there a way to reduce solve setup time?
I solve Ax = b once. With "max_level = 1" the solution matches the CPU result, but the run is slower than on the CPU.
Increasing "max_level" reduces the setup time but increases the iteration count.
Is there a way to reduce the setup time?
Or can I save time by removing other unnecessary settings?
NVIDIA A10
AMGX version 2.4.0
Built on Nov 13 2023, 18:10:19
Compiled with CUDA Runtime 11.4, using CUDA driver 11.4
The AMGX_initialize_plugins API call is deprecated and can be safely removed.
FGMRES_AGGREGATION.json
Config max_level = 1
AMG Grid:
Number of Levels: 1
LVL    ROWS   NNZ     PARTS   SPRSTY     Mem (GB)
----------------------------------------------------------------------
0(D)   8423   36031   1       0.000508   0.000528
----------------------------------------------------------------------
Grid Complexity: 1
Operator Complexity: 1
Total Memory Usage: 0.000528205 GB
----------------------------------------------------------------------
iter   Mem Usage (GB)   residual       rate
----------------------------------------------------------------------
Ini    2.18726          9.093929e+02
0      2.18726          2.781798e-11   0.0000
----------------------------------------------------------------------
Total Iterations: 1
Avg Convergence Rate: 0.0000
Final Residual: 2.781798e-11
Total Reduction in Residual: 3.058962e-14
Maximum Memory Usage: 2.187 GB
----------------------------------------------------------------------
Total Time: 1.10915
    setup: 1.10528 s
    solve: 0.00387482 s
    solve(per iteration): 0.00387482 s
With Config max_level = 100:
AMG Grid:
Number of Levels: 4
LVL    ROWS   NNZ     PARTS   SPRSTY     Mem (GB)
----------------------------------------------------------------------
0(D)   8423   36031   1       0.000508   0.000611
1(D)   1371   9123    1       0.00485    0.000264
2(D)   592    4060    1       0.0116     0.000117
3(D)   263    1715    1       0.0248     4.62e-05
----------------------------------------------------------------------
Grid Complexity: 1.26428
Operator Complexity: 1.41348
Total Memory Usage: 0.00103818 GB
----------------------------------------------------------------------
Total Iterations: 788
Avg Convergence Rate: 0.9654
Final Residual: 8.014293e-10
Total Reduction in Residual: 8.812795e-13
Maximum Memory Usage: 1.023 GB
----------------------------------------------------------------------
Total Time: 3.97606
    setup: 0.00537088 s
    solve: 3.97069 s
    solve(per iteration): 0.00503894 s
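To compare the two runs side by side, the summary figures in the logs can be recomputed from the raw numbers. This is only a sketch: it assumes the initial residual is the same in both runs (the iteration table is not shown for the second log) and that "Avg Convergence Rate" is the geometric mean of the per-iteration reduction, which is a guess that happens to agree with the logged value. The names `runs` and `summarize` are made up for this example.

```python
# Values copied from the two solver logs above.
INITIAL_RESIDUAL = 9.093929e+02  # assumed identical for both runs

runs = {
    "max_level=1": {
        "final_residual": 2.781798e-11,
        "iterations": 1,
        "setup_s": 1.10528,
        "solve_s": 0.00387482,
    },
    "max_level=100": {
        "final_residual": 8.014293e-10,
        "iterations": 788,
        "setup_s": 0.00537088,
        "solve_s": 3.97069,
    },
}


def summarize(run):
    """Recompute the summary figures of one run from its raw numbers."""
    reduction = run["final_residual"] / INITIAL_RESIDUAL
    return {
        "reduction": reduction,
        # Assumption: average rate = geometric mean of the reduction.
        "avg_rate": reduction ** (1.0 / run["iterations"]),
        "per_iter_s": run["solve_s"] / run["iterations"],
        "total_s": run["setup_s"] + run["solve_s"],
    }


for name, run in runs.items():
    s = summarize(run)
    print(
        f"{name}: reduction={s['reduction']:.6e} "
        f"avg_rate={s['avg_rate']:.4f} "
        f"per_iter={s['per_iter_s']:.6f}s total={s['total_s']:.5f}s"
    )
```

Read this way, the single-level run spends almost all of its time in setup, while the multi-level run spends almost all of it in 788 iterations of roughly 5 ms each.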
"use_scalar_norm": 1,
"print_solve_stats": 1,
"solver": "FGMRES",
"obtain_timings": 1,
"max_iters": 1000,
"monitor_residual": 1,
"gmres_n_restart": 400,
"convergence": "RELATIVE_INI_CORE",
"scope": "main",
"tolerance": 1e-08,
"norm": "L1"
"max_iters": 1000, , "gmres_n_restart": 400, This must be set so that the value is the same as when max_level = 1 Is there an effective way to set it up?
I actually can't reproduce your results with your matrix and configs. The matrix also seems to be pretty ill-conditioned; can you confirm that?
Regardless, since you are running on an A10, I would recommend switching to fp32 precision (if you haven't already), as fp64 throughput is almost non-existent on that GPU. You can add -mode dFFI to the AMGX examples to enable it.
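For readers unfamiliar with the mode string, here is a small sketch of how the four letters are read, as far as I understand the AMGX naming: host/device, matrix precision, vector precision, index type. The helper `decode_amgx_mode` is hypothetical and not part of AMGX; it only handles the real-valued D/F modes.

```python
def decode_amgx_mode(mode: str) -> dict:
    """Decode a four-letter AMGX mode string such as 'dFFI' or 'dDDI'.

    Hypothetical helper for illustration only; complex-valued modes
    are deliberately not covered.
    """
    locations = {"h": "host", "d": "device"}
    precisions = {"D": "fp64", "F": "fp32"}
    indices = {"I": "int32"}

    if len(mode) != 4:
        raise ValueError(f"expected 4 characters, got {mode!r}")
    loc, mat, vec, idx = mode
    try:
        return {
            "location": locations[loc],
            "matrix": precisions[mat],
            "vector": precisions[vec],
            "index": indices[idx],
        }
    except KeyError as err:
        raise ValueError(f"unsupported letter {err.args[0]!r} in {mode!r}") from None


print(decode_amgx_mode("dFFI"))
print(decode_amgx_mode("dDDI"))
```

So dFFI keeps everything on the device but stores both the matrix and the vectors in single precision.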
Looking at your output for
FGMRES_AGGREGATION.json
Config max_level = 1
do you want to construct just one additional level of multigrid, or skip multigrid altogether (and use something like GMRES with Jacobi as the preconditioner)? A setup time of around one second is a sign that something is wrong, since multigrid should hardly be involved.
In your run with max_level = 100 (4 levels) I see that the setup time is actually lower than with max_level = 1, which is another sign that something's fishy :) Can you share the full configs you used for both runs?
For such a small matrix (about 8k rows), other methods might be more effective too.
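As a rough illustration of the "skip multigrid" option, the sketch below builds a config that keeps the outer FGMRES settings from the question and swaps the preconditioner for Jacobi. Treat it as an untested assumption rather than a recommended setup: the keys `config_version` and `preconditioner`, the solver name `BLOCK_JACOBI`, the scope name `jacobi`, and the inner `max_iters` value are written from memory of the configs shipped with AMGX and should be checked against your 2.4.0 build.

```python
import json

# Outer solver settings are copied from the config fragment in the question.
# The "preconditioner" block is the assumed, unverified part.
config = {
    "config_version": 2,
    "solver": {
        "solver": "FGMRES",
        "scope": "main",
        "max_iters": 1000,
        "gmres_n_restart": 400,
        "convergence": "RELATIVE_INI_CORE",
        "tolerance": 1e-08,
        "norm": "L1",
        "use_scalar_norm": 1,
        "monitor_residual": 1,
        "print_solve_stats": 1,
        "obtain_timings": 1,
        "preconditioner": {
            "solver": "BLOCK_JACOBI",  # assumption: Jacobi instead of AMG
            "scope": "jacobi",         # hypothetical scope name
            "max_iters": 2,            # illustrative value, not tuned
            "monitor_residual": 0,
        },
    },
}

config_text = json.dumps(config, indent=4)
print(config_text)
```

The printed JSON can be saved to a file and passed to the solver in place of FGMRES_AGGREGATION.json to compare setup and solve times.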
Let us know if you have any more questions.