Or can you save time by removing other unnecessary settings?

NVIDIA A10

    AMGX version 2.4.0
    Built on Nov 13 2023, 18:10:19
    Compiled with CUDA Runtime 11.4, using CUDA driver 11.4
    The AMGX_initialize_plugins API call is deprecated and can be safely removed.

FGMRES_AGGREGATION.json

Config max_level = 1

    AMG Grid:
    Number of Levels: 1
    LVL     ROWS    NNZ     PARTS   SPRSTY      Mem (GB)
    ----------------------------------------------------------------------
    0(D)    8423    36031   1       0.000508    0.000528
    ----------------------------------------------------------------------
    Grid Complexity: 1
    Operator Complexity: 1
    Total Memory Usage: 0.000528205 GB
    ----------------------------------------------------------------------
    iter    Mem Usage (GB)    residual        rate
    ----------------------------------------------------------------------
    Ini     2.18726           9.093929e+02
    0       2.18726           2.781798e-11    0.0000
    ----------------------------------------------------------------------
    Total Iterations: 1
    Avg Convergence Rate: 0.0000
    Final Residual: 2.781798e-11
    Total Reduction in Residual: 3.058962e-14
    Maximum Memory Usage: 2.187 GB
    ----------------------------------------------------------------------
    Total Time: 1.10915
        setup: 1.10528 s
        solve: 0.00387482 s
        solve(per iteration): 0.00387482 s

With config max_level = 100:

    AMG Grid:
    Number of Levels: 4
    LVL     ROWS    NNZ     PARTS   SPRSTY      Mem (GB)
    ----------------------------------------------------------------------
    0(D)    8423    36031   1       0.000508    0.000611
    1(D)    1371    9123    1       0.00485     0.000264
    2(D)    592     4060    1       0.0116      0.000117
    3(D)    263     1715    1       0.0248      4.62e-05
    ----------------------------------------------------------------------
    Grid Complexity: 1.26428
    Operator Complexity: 1.41348
    Total Memory Usage: 0.00103818 GB

     ----------------------------------------------------------------------
     Total Iterations: 788
     Avg Convergence Rate: 		         0.9654
     Final Residual: 		   8.014293e-10
     Total Reduction in Residual: 	   8.812795e-13
     Maximum Memory Usage: 		          1.023 GB
     ----------------------------------------------------------------------

    Total Time: 3.97606
        setup: 0.00537088 s
        solve: 3.97069 s
        solve(per iteration): 0.00503894 s
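As a quick sanity check on the two logs, the summary lines can be recomputed from the reported numbers. The sketch below is plain Python with values copied from the output above. It assumes that "Total Reduction in Residual" is final residual divided by initial residual, and that "Avg Convergence Rate" is the geometric mean of the per-iteration reduction; both are inferred from the numbers, not taken from AMGX documentation.

```python
# Values copied from the two AMGX logs above.
one_level = {"initial": 9.093929e+02, "final": 2.781798e-11,
             "iters": 1, "solve_s": 0.00387482}
four_level = {"final": 8.014293e-10, "reduction": 8.812795e-13,
              "iters": 788, "solve_s": 3.97069}

# Assumption: total reduction = final residual / initial residual.
reduction_1 = one_level["final"] / one_level["initial"]

# The 4-level log omits the initial residual; recover it from the reduction.
implied_initial_4 = four_level["final"] / four_level["reduction"]

# Assumption: average rate = reduction ** (1 / iterations).
rate_4 = four_level["reduction"] ** (1.0 / four_level["iters"])

# Solve time per iteration.
per_iter_4 = four_level["solve_s"] / four_level["iters"]

print(f"1-level total reduction:    {reduction_1:.6e}")
print(f"4-level implied initial:    {implied_initial_4:.6e}")
print(f"4-level avg rate:           {rate_4:.4f}")
print(f"4-level solve per iteration: {per_iter_4:.8f} s")
```

Both runs start from the same initial residual (about 9.09e+02), so they are solving the same system. The 4-level run reduces the residual by only about 3.5% per iteration, which is why it needs 788 iterations.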

    "use_scalar_norm": 1, 
    "print_solve_stats": 1, 
    "solver": "FGMRES", 
    "obtain_timings": 1, 
    "max_iters": 1000, 
    "monitor_residual": 1, 
    "gmres_n_restart": 400, 
    "convergence": "RELATIVE_INI_CORE", 
    "scope": "main", 
    "tolerance": 1e-08, 
    "norm": "L1"

"max_iters": 1000, , "gmres_n_restart": 400, This must be set so that the value is the same as when max_level = 1 Is there an effective way to set it up?

output_vectorX.txt Matrix.mtx.txt

Nov 27 '23 07:11 leemunseon

I actually can't reproduce the results with your matrix and configs. The matrix also seems to be pretty ill-conditioned. Can you confirm that?

Regardless, since you are running on an A10, I would recommend (if you haven't already) switching to fp32 precision, as fp64 throughput is almost non-existent on that GPU. You can add -mode dFFI to the AMGX examples to enable it.
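One thing to check when switching to single precision: the config in the question asks for a relative tolerance of 1e-08, which is below fp32 machine epsilon (about 1.19e-07). The sketch below uses only the Python standard library to illustrate that gap. Whether AMGX can still reach the requested tolerance in dFFI mode for this matrix is an open question to verify, not something the logs show.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

FP32_EPS = 2.0 ** -23   # spacing between 1.0 and the next fp32 number
FP64_EPS = 2.0 ** -52   # spacing between 1.0 and the next fp64 number
tolerance = 1e-08       # "tolerance" from the config in the question

# 1 + eps survives rounding to fp32, 1 + eps/4 does not.
survives = to_fp32(1.0 + FP32_EPS) != 1.0
lost = to_fp32(1.0 + FP32_EPS / 4) == 1.0

print(f"fp32 eps = {FP32_EPS:.3e}, fp64 eps = {FP64_EPS:.3e}")
print(f"tolerance below fp32 eps: {tolerance < FP32_EPS}")
print(f"1 + eps representable: {survives}, 1 + eps/4 rounds to 1: {lost}")
```

If the fp32 run stalls above the requested tolerance, loosening "tolerance" would be one thing to try.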

Looking at your output for

FGMRES_AGGREGATION.json
Config max_level = 1

do you want to construct just 1 additional level of multigrid, or skip it altogether (and use something like GMRES with Jacobi as the preconditioner)? A setup time of around one second is a sign that something is wrong, since multigrid should barely be involved.
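If the goal is to skip multigrid altogether, a configuration along these lines might serve as a starting point. It is a sketch, not a tested setup. The outer-solver values are copied from the config fragment in the question. The `config_version` key, the nested `preconditioner` block, the `BLOCK_JACOBI` solver name, and the `jacobi` scope name are assumptions modeled on the sample configs shipped with AMGX, and should be checked against your AMGX version.

```python
import json

# Outer FGMRES settings: copied from the fragment posted in the question.
outer = {
    "use_scalar_norm": 1,
    "print_solve_stats": 1,
    "solver": "FGMRES",
    "obtain_timings": 1,
    "max_iters": 1000,
    "monitor_residual": 1,
    "gmres_n_restart": 400,
    "convergence": "RELATIVE_INI_CORE",
    "scope": "main",
    "tolerance": 1e-08,
    "norm": "L1",
}

# ASSUMPTION: a Jacobi preconditioner in place of the AMG block.
# Key names and values here are not taken from the thread.
outer["preconditioner"] = {
    "solver": "BLOCK_JACOBI",
    "scope": "jacobi",
    "max_iters": 1,
}

# ASSUMPTION: version-2 (nested JSON) config layout.
config = {"config_version": 2, "solver": outer}

config_text = json.dumps(config, indent=4)
print(config_text)
```

Saving the printed JSON to a file and passing it to an AMGX example with -c would then give a multigrid-free baseline for both setup and solve time.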

In your 4-level run I see that the setup time is actually lower than with max_level = 1, which is another sign that something's fishy :) Can you share the full configs you used for both runs?

For such a small matrix (about 8k rows), other methods might be more effective too.

Dec 04 '23 18:12 marsaev

Let us know if you have any more questions.

Jul 04 '24 00:07 marsaev