Restarting TCE calculation

Open hernan3009 opened this issue 5 years ago • 7 comments

Dear Developers,

I have not been able to restart TCE calculations. In particular, I am trying to learn how to restart CCSDT(2)_Q calculations with NWChem 6.8 on a single PC. I really need this feature because of the very frequent power outages in my city.

After reading the documentation, I made many attempts following this logic.

For the first run:

#set tce:read_integrals T
#set tce:read_t  T
#set tce:read_l  T
#set tce:read_tr T

set tce:save_integrals T
set tce:save_t  T
set tce:save_l  T
set tce:save_tr T

For the subsequent runs, I selectively uncommented the read directives, trying to enable reading of only those quantities that had been computed previously. I also tried commenting out the 'save' directives in the last runs.
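The comment/uncomment pattern described above can be generated rather than edited by hand. The following is only a sketch: the helper name `tce_restart_directives` and its parameters are hypothetical, and it emits nothing but the `set tce:read_*` / `set tce:save_*` lines quoted in this post. Whether NWChem 6.8 actually honours them for a given IO scheme is the open question of this thread.

```python
# Hypothetical helper: renders the read/save directive block from this post.
# The key names (integrals, t, l, tr) are the ones quoted above; nothing here
# talks to NWChem, it only produces text for an input file.
QUANTITIES = ("integrals", "t", "l", "tr")


def tce_restart_directives(already_saved=(), keep_saving=True):
    """Return the directive block as a string.

    already_saved: quantities a previous run is believed to have written;
                   only their read_* lines are left uncommented.
    keep_saving:   if False, the save_* lines are commented out as well.
    """
    unknown = sorted(set(already_saved) - set(QUANTITIES))
    if unknown:
        raise ValueError(f"unknown quantities: {unknown}")

    lines = []
    for q in QUANTITIES:
        prefix = "" if q in already_saved else "#"
        lines.append(f"{prefix}set tce:read_{q} T")
    lines.append("")
    for q in QUANTITIES:
        prefix = "" if keep_saving else "#"
        lines.append(f"{prefix}set tce:save_{q} T")
    return "\n".join(lines)


if __name__ == "__main__":
    # First run: nothing to read yet, save everything.
    print(tce_restart_directives())
    print()
    # Later run: assume only the integrals were written before the outage.
    print(tce_restart_directives(already_saved=("integrals",)))
```

Called with no arguments it reproduces the first-run block; passing `already_saved` enables only the reads for quantities that were written before the interruption.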

I also tried set tce:writeint t, set tce:writet t, set tce:readt t, and set tce:readint t, mixed with the options mentioned above, but I do not know how to use them. I found them in some posts but not in the documentation.

The only part I can manage to restart is the electron integrals, and only when I use the IO GA option in TCE. I cannot save anything when using IO SF. With IO GA I can save files for restart, e.g.

CCSDT iterations
 --------------------------------------------------------
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
 *************** Warning ***************
  Create file size is zero. Calculation
  will continue by increasing the size.
  Use of a larger basis set is advised.
    1   0.3782968689333  -0.3252459704851     4.7     4.7
 Saving T1 now...
      x1_restart_save filename: ./prueba.t1_copy                                                                
                                                         x1_restart_save finished
    2   0.0852896027122  -0.3322396786580     4.7     4.7
 Saving T1 now...
      x1_restart_save filename: ./prueba.t1_copy                                                                
                                                         x1_restart_save finished
    3   0.0609881907860  -0.3460606144011     4.6     4.7

but I cannot use them for restarting. A similar run with IO SF returns

$ mpirun -np 2 nwchem input.nw > salida.out 

[0] ARMCI Error: 0:ngai_get_common:ngai_get_common: INVALID ARRAY HANDLE:
[1] ARMCI Error: 1:ngai_get_common:ngai_get_common: INVALID ARRAY HANDLE:
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI COMMUNICATOR 3 DUP FROM 0
with errorcode -2997.
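To check what an interrupted run actually wrote before attempting a restart, the iteration table in the IO GA output can be parsed. This is a rough sketch that assumes the output keeps the same layout as the excerpt in this post (five numeric columns per iteration and `x1_restart_save filename:` lines); the function name `summarize_tce_log` is made up for illustration.

```python
import re

# One iteration row: index, residuum, correlation energy, cpu, wall.
ITER_RE = re.compile(
    r"^\s*(\d+)\s+([-+]?\d+\.\d+)\s+([-+]?\d+\.\d+)\s+[\d.]+\s+[\d.]+\s*$"
)
# Restart-save line as it appears in the excerpt from this post.
SAVE_RE = re.compile(r"x1_restart_save filename:\s*(\S+)")


def summarize_tce_log(text):
    """Return (iterations, saved_files) found in a TCE output excerpt.

    iterations:  list of (iteration, residuum, correlation_energy) tuples
    saved_files: distinct file names reported by x1_restart_save, in order
    """
    iterations = []
    saved_files = []
    for line in text.splitlines():
        m = ITER_RE.match(line)
        if m:
            iterations.append((int(m.group(1)), float(m.group(2)), float(m.group(3))))
            continue
        m = SAVE_RE.search(line)
        if m and m.group(1) not in saved_files:
            saved_files.append(m.group(1))
    return iterations, saved_files


# Excerpt copied from the output quoted in this post.
SAMPLE_LOG = """\
 Iter          Residuum       Correlation     Cpu    Wall
 --------------------------------------------------------
    1   0.3782968689333  -0.3252459704851     4.7     4.7
 Saving T1 now...
      x1_restart_save filename: ./prueba.t1_copy
    2   0.0852896027122  -0.3322396786580     4.7     4.7
 Saving T1 now...
      x1_restart_save filename: ./prueba.t1_copy
    3   0.0609881907860  -0.3460606144011     4.6     4.7
"""

if __name__ == "__main__":
    iters, files = summarize_tce_log(SAMPLE_LOG)
    print("last completed iteration:", iters[-1][0])
    print("restart files reported:", files)
```

The last iteration number and the reported file names show which amplitudes a follow-up run could, in principle, try to read back.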

How can I restart the calculation?

Thanks in advance

hernan3009 avatar Sep 11 '19 15:09 hernan3009

Please provide (attach) complete input files for the two runs

edoapra avatar Sep 11 '19 20:09 edoapra

Thank you for the answer, @edoapra. Here are the outputs (which include the inputs):

Using the default IO GA and forcing the calculation to stop by killing the process: Input/Output_00

A second run (with ) trying to restart the previous run: Input/Output_01

Using IO SF fails with the following message in the Linux terminal (the files from previous calculations were removed). Input/Output_02

[1] ARMCI Error: 1:ngai_get_common:ngai_get_common: INVALID ARRAY HANDLE:
[0] ARMCI Error: 0:ngai_get_common:ngai_get_common: INVALID ARRAY HANDLE:
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI COMMUNICATOR 3 DUP FROM 0
with errorcode -2997.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[my-desktop:20197] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[my-desktop:20197] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Note that this particular system/basis set is only for learning how to restart calculations.

hernan3009 avatar Sep 11 '19 22:09 hernan3009

Deleted.

xiongyan21 avatar Sep 19 '19 14:09 xiongyan21

@hernan3009 bold face is almost certainly an artifact of markdown parsing asterisks as bold face.

jeffhammond avatar Sep 19 '19 16:09 jeffhammond

This suggests something is wrong...

  Create file size is zero. Calculation
  will continue by increasing the size.
  Use of a larger basis set is advised.

jeffhammond avatar Sep 19 '19 16:09 jeffhammond

@jeffhammond I understand

hernan3009 avatar Sep 19 '19 16:09 hernan3009

Deleted.

xiongyan21 avatar Sep 20 '19 02:09 xiongyan21