SEQ fails on GPUs

Open victorapm opened this issue 1 year ago • 4 comments

SEQ strategy is failing on GPUs as of https://github.com/GEOS-DEV/GEOS/commit/da6e83fc4f0f43a803b3300a3ec955ae0f404662

Example output for SEAM (coarse):

------------------- TIMESTEP START -------------------
    - Time:       -3168y, -319d, -4h-1m-4s (-100000000000 s)
    - Delta Time: 3168y, 319d, 04h01m04s (100000000000 s)
    - Cycle:      0
------------------------------------------------------


Time: -1.00e+11 s, dt: 100000000000 s, Cycle: 0

Task `ELASTICITY.PRE.INIT.STEP`: at time -100000000000s, physics solver `SEAM` is set to perform stress initialization during the next time step(s)
Task `ELASTICITY.PRE.INIT.STEP`: at time -100000000000s, physics solver `LINEAR.ELASTICITY` is resetting total displacement and velocity to zero
  Iteration  1: RESERVOIRANDWELL.SOLVER
    Attempt:  0, ConfigurationIter:  0, NewtonIter:  0
        ( Rflow ) = ( 0.00e+00 )        ( Rwell ) = ( 0.00e+00 )        ( R ) = ( 0.00e+00 )
        MGR preconditioner: numComponentsPerField = [3, 4]
        Last LinSolve(iter,res) = (   0, 0.00e+00 )
        FLOW.SOLVER: Max pressure change: 0.000 Pa (before scaling)
        FLOW.SOLVER: Max component density change: 0.000 kg/m3 (before scaling)
        FLOW.SOLVER: Min pressure scaling factor: 1
        FLOW.SOLVER: Min component density scaling factor: 1
        RESERVOIRANDWELL.SOLVER: Global solution scaling factor = 1
        FLOW.SOLVER: Max phase volume fraction change = 0.0000
    Attempt:  0, ConfigurationIter:  0, NewtonIter:  1
        ( Rflow ) = ( 0.00e+00 )        ( Rwell ) = ( 0.00e+00 )        ( R ) = ( 0.00e+00 )
  Iteration  1: LINEAR.ELASTICITY
    Attempt:  0, ConfigurationIter:  0, NewtonIter:  0
        ( Rsolid ) = ( 2.74e+01 )        ( R ) = ( 2.74e+01 )
***** ERROR
***** LOCATION: ${GEOS_PATH}/src/coreComponents/linearAlgebra/interfaces/hypre/HyprePreconditioner.cpp:445
***** Controlling expression (should be false): ierr != 0
***** Rank 2: Error in call to m_precond->destroy( m_precond->ptr )
Expected ierr == 0
  ierr = 1
  0 = 0

victorapm, Feb 16 '24 21:02

Testing GPU runs for SPE10_with_burdens_sequential.xml with develop repo 5c88b35c2.

------------------- TIMESTEP START -------------------
    - Time:       00h00m00s (0 s)
    - Delta Time: 01h00m00s (3600 s)
    - Cycle:      0
------------------------------------------------------


Time: 0.00e+00 s, dt: 3600 s, Cycle: 0

  Iteration  1: singlePhaseFlow
    Attempt:  0, ConfigurationIter:  0, NewtonIter:  0
        ( Rflow ) = ( 2.74e+03 )        ( R ) = ( 2.74e+03 )

The simulation freezes at the first time step and does not move forward.

jhuang2601, Feb 20 '24 15:02

Hey @jhuang2601, what hardware are you running these on?

drmichaeltcvx, Apr 04 '24 16:04

For SEAM, it looks like there is an issue with these parameters:

                amgThreshold="0.6"
                amgAggressiveCoarseningLevels="1"

When I removed them, it ran just fine, whereas before it was stuck at the first mechanical solve.
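
For anyone who wants to try the same workaround on their own input deck, here is a minimal sketch that drops those two attributes from every element of an XML string. The attribute names come from this thread. The element layout in `example` is only illustrative and is not copied from the SEAM input file. Note that `xml.etree.ElementTree` discards XML comments on parse, so this is better suited to producing a test copy than to editing a file in place.

```python
import xml.etree.ElementTree as ET

# The two attributes reported as problematic in this thread.
AMG_ATTRS = ("amgThreshold", "amgAggressiveCoarseningLevels")


def strip_amg_overrides(xml_text: str) -> str:
    """Return xml_text with the two AMG attributes removed from every element."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        for name in AMG_ATTRS:
            # pop() with a default is a no-op when the attribute is absent.
            elem.attrib.pop(name, None)
    return ET.tostring(root, encoding="unicode")


# Illustrative fragment only (assumed layout, not the actual SEAM deck).
example = """<Solvers>
  <SolidMechanicsLagrangianFEM name="linearElasticity">
    <LinearSolverParameters
      solverType="gmres"
      preconditionerType="amg"
      amgThreshold="0.6"
      amgAggressiveCoarseningLevels="1"/>
  </SolidMechanicsLagrangianFEM>
</Solvers>"""

if __name__ == "__main__":
    print(strip_amg_overrides(example))
```

With the attributes removed, the solver should fall back to whatever defaults the code uses for them. This only sidesteps the failure and says nothing about why those settings break on GPUs.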

paveltomin, Apr 04 '24 16:04

I can confirm it hangs at random places (re-run it multiple times and it will hang at a different linear solve each time).
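
To make this kind of intermittent hang easier to compare across runs, a small wrapper can kill the run after a timeout and report the last log line it saw. This is only a sketch. The launch command is left to the caller, and the demo below uses a stand-in child process rather than a real simulation. Also, a program whose stdout is piped may buffer its output, so the last captured line can lag behind where the run actually stopped.

```python
import subprocess
import sys


def last_line_before_timeout(cmd, timeout_s):
    """Run cmd; return (hung, last non-empty output line captured)."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
        hung, raw = False, done.stdout
    except subprocess.TimeoutExpired as exc:
        # The child is killed by subprocess.run; exc.stdout holds what was
        # captured so far (it may be None if nothing was read).
        hung, raw = True, exc.stdout
    text = raw.decode(errors="replace") if isinstance(raw, bytes) else (raw or "")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return hung, (lines[-1] if lines else "")


if __name__ == "__main__":
    # Stand-in for a run that stalls: prints one line, then sleeps.
    # Replace with the actual launch command for a real test.
    stalled = [
        sys.executable,
        "-c",
        "import time; print('Iteration 1', flush=True); time.sleep(60)",
    ]
    for attempt in range(3):
        hung, last = last_line_before_timeout(stalled, timeout_s=3)
        print(f"attempt {attempt}: hung={hung}, last line: {last!r}")
```

Collecting the last line over several attempts shows whether the hang always lands in the same solve or moves around, as described above.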

paveltomin, Apr 04 '24 16:04