running multiple simulations in one script with separate monitor output from each

Open egstern opened this issue 7 months ago • 2 comments

What I would like to do is run two different configurations of the same lattice and produce two different monitorxxx.h5 files in one run, so I can compare them easily. If I run using two sim objects, each one creating a monitor element with a different name, then the files seem to be written correctly, but I get a lot of HDF5 errors when the job ends. If I instead run two separate sim jobs serially, one after another, each one writing to a different monitor file, then I get directories diags.old.nnnnnn/openPMD with results from the first simulation and diags/openPMD with results from the second simulation, but no error messages.
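One possible way to keep the outputs of each configuration apart is to launch every configuration in its own interpreter process with its own working directory, so that each run gets a private diags/openPMD directory. The sketch below only illustrates that pattern. The stand-in script, the file name `one_sim.py`, the `nslice_<n>` directory layout and the helper names `run_isolated` and `run_all` are all hypothetical, and the stand-in writes a text marker instead of calling ImpactX. Whether process isolation also avoids the HDF5 messages at exit has not been verified here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for one simulation script. A real script would build
# the lattice, add a monitor element and track; this one only writes a marker
# file where a real run would place its openPMD output.
STAND_IN_SCRIPT = """\
import sys
from pathlib import Path
out = Path("diags") / "openPMD"
out.mkdir(parents=True, exist_ok=True)
(out / "monitor.txt").write_text("nslice=" + sys.argv[1])
"""


def run_isolated(script: Path, run_dir: Path, nslice: int) -> int:
    """Run script in a fresh interpreter with run_dir as working directory."""
    run_dir.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        [sys.executable, str(script.resolve()), str(nslice)],
        cwd=run_dir,
        check=False,
    )
    return result.returncode


def run_all(base: Path, nslices: list[int]) -> dict[int, Path]:
    """Run one isolated job per nslice value; return each job's output directory."""
    script = base / "one_sim.py"
    script.write_text(STAND_IN_SCRIPT)
    outputs = {}
    for n in nslices:
        run_dir = base / f"nslice_{n}"
        if run_isolated(script, run_dir, n) != 0:
            raise RuntimeError(f"run with nslice={n} failed")
        outputs[n] = run_dir / "diags" / "openPMD"
    return outputs


if __name__ == "__main__":
    # Same nslice values as in the log below; output goes to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        for n, path in run_all(Path(tmp), [1, 2, 4, 8, 16]).items():
            print(n, sorted(p.name for p in path.iterdir()))
```

Because every job starts in a different working directory, no run finds an existing diags directory from an earlier one, and the results can be compared side by side afterwards.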

egstern, Apr 16 '25 19:04

Example script

egstern, Apr 17 '25 15:04

Running the example script:

(impactx-cpu-mpich-dev) [egstern@WL-146057 multi_sim]$ python multi_sim_inline.py
Initializing AMReX (25.02)...
MPI initialized with 1 MPI processes
MPI initialized with thread support level 3
OMP initialized with 6 OMP threads
AMReX (25.02) initialized

Grids Summary:
  Level 0   1 grids  512 cells  100 % of domain

created monitor for nslice:1, id: 140345410478768

Grids Summary:
  Level 0   1 grids  512 cells  100 % of domain

created monitor for nslice:2, id: 140345410670000

Grids Summary:
  Level 0   1 grids  512 cells  100 % of domain

created monitor for nslice:4, id: 140345489750320

Grids Summary:
  Level 0   1 grids  512 cells  100 % of domain

created monitor for nslice:8, id: 140345489740592

Grids Summary:
  Level 0   1 grids  512 cells  100 % of domain

created monitor for nslice:16, id: 140345489327664
 Diagnostics: 1
 Space Charge effects: False
 CSR effects: 0
 ++++ Starting step=1 slice_step=0



**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************

 ++++ Starting step=2 slice_step=0

 ++++ Starting step=3 slice_step=0

 Diagnostics: 1
 Space Charge effects: False
 CSR effects: 0
 ++++ Starting step=1 slice_step=0



**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************

 ++++ Starting step=2 slice_step=0

 ++++ Starting step=3 slice_step=1

 ++++ Starting step=4 slice_step=0

 Diagnostics: 1
 Space Charge effects: False
 CSR effects: 0
 ++++ Starting step=1 slice_step=0



**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************

 ++++ Starting step=2 slice_step=0

 ++++ Starting step=3 slice_step=1

 ++++ Starting step=4 slice_step=2

 ++++ Starting step=5 slice_step=3

 ++++ Starting step=6 slice_step=0

 Diagnostics: 1
 Space Charge effects: False
 CSR effects: 0
 ++++ Starting step=1 slice_step=0



**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************

 ++++ Starting step=2 slice_step=0

 ++++ Starting step=3 slice_step=1

 ++++ Starting step=4 slice_step=2

 ++++ Starting step=5 slice_step=3

 ++++ Starting step=6 slice_step=4

 ++++ Starting step=7 slice_step=5

 ++++ Starting step=8 slice_step=6

 ++++ Starting step=9 slice_step=7

 ++++ Starting step=10 slice_step=0

 Diagnostics: 1
 Space Charge effects: False
 CSR effects: 0
 ++++ Starting step=1 slice_step=0



**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************

 ++++ Starting step=2 slice_step=0

 ++++ Starting step=3 slice_step=1

 ++++ Starting step=4 slice_step=2

 ++++ Starting step=5 slice_step=3

 ++++ Starting step=6 slice_step=4

 ++++ Starting step=7 slice_step=5

 ++++ Starting step=8 slice_step=6

 ++++ Starting step=9 slice_step=7

 ++++ Starting step=10 slice_step=8

 ++++ Starting step=11 slice_step=9

 ++++ Starting step=12 slice_step=10

 ++++ Starting step=13 slice_step=11

 ++++ Starting step=14 slice_step=12

 ++++ Starting step=15 slice_step=13

 ++++ Starting step=16 slice_step=14

 ++++ Starting step=17 slice_step=15

 ++++ Starting step=18 slice_step=0



TinyProfiler total time across processes [min...avg...max]: 0.09809 ... 0.09809 ... 0.09809

-------------------------------------------------------------------------------------------------------
Name                                                    NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------------
impactx::Push::BeamMonitor                                  10    0.03497    0.03497    0.03497  35.66%
DistributionMapping::LeastUsedCPUs()                         5   0.005877   0.005877   0.005877   5.99%
ImpactX::init_grids                                          4   0.004216   0.004216   0.004216   4.30%
ImpactX::AddNParticles                                       5   0.004048   0.004048   0.004048   4.13%
impactx::Push::ExactSbend                                   31   0.003366   0.003366   0.003366   3.43%
impactx::diagnostics::reduced_beam_characteristics(pc)      20    0.00244    0.00244    0.00244   2.49%
ImpactX::evolve::slice_step                                 41  0.0007012  0.0007012  0.0007012   0.71%
ImpactX::track_particles                                     5   0.000487   0.000487   0.000487   0.50%
impactx::diagnostics::DiagnosticOutput(pc)                  20  0.0004609  0.0004609  0.0004609   0.47%
ImpactX::validate                                            5  0.0004432  0.0004432  0.0004432   0.45%
impactx::Push                                               41  2.178e-05  2.178e-05  2.178e-05   0.02%
DistributionMapping::SFCProcessorMapDoIt()                   5  1.406e-05  1.406e-05  1.406e-05   0.01%
AmrMesh::MakeDistributionMap()                               5  1.249e-05  1.249e-05  1.249e-05   0.01%
Other                                                      194  0.0007031  0.0007031  0.0007031   0.72%
-------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------
Name                                                    NCalls  Incl. Min  Incl. Avg  Incl. Max   Max %
-------------------------------------------------------------------------------------------------------
ImpactX::track_particles                                     5    0.04359    0.04359    0.04359  44.44%
ImpactX::evolve::slice_step                                 41    0.04078    0.04078    0.04078  41.58%
impactx::Push                                               41    0.03953    0.03953    0.03953  40.30%
impactx::Push::BeamMonitor                                  10    0.03611    0.03611    0.03611  36.81%
AmrMesh::MakeDistributionMap()                               5   0.005909   0.005909   0.005909   6.02%
DistributionMapping::SFCProcessorMapDoIt()                   5   0.005896   0.005896   0.005896   6.01%
DistributionMapping::LeastUsedCPUs()                         5   0.005877   0.005877   0.005877   5.99%
ImpactX::init_grids                                          4   0.004427   0.004427   0.004427   4.51%
ImpactX::AddNParticles                                       5   0.004048   0.004048   0.004048   4.13%
impactx::Push::ExactSbend                                   31   0.003401   0.003401   0.003401   3.47%
impactx::diagnostics::reduced_beam_characteristics(pc)      20    0.00244    0.00244    0.00244   2.49%
impactx::diagnostics::DiagnosticOutput(pc)                  20    0.00188    0.00188    0.00188   1.92%
ImpactX::validate                                            5  0.0004432  0.0004432  0.0004432   0.45%
Other                                                      194  0.0008104  0.0008104  0.0008104   0.83%
-------------------------------------------------------------------------------------------------------

Pinned Memory Usage:
-----------------------------------------------------------------
Name                             Nalloc  Nfree   AvgMem    MaxMem
-----------------------------------------------------------------
The_Pinned_Arena::Initialize()        1      1  133 KiB  8192 KiB
ParticleContainer::addParticles     540    540  504   B  1440   B
ImpactX::early_param_check            5      5    0   B    32   B
ImpactX::init_grids                  14     14    0   B    32   B
ImpactX::track_particles             14     14    0   B    32   B
Unprofiled                            8      8    0   B    32   B
-----------------------------------------------------------------

Cpu Memory Usage:
----------------------------------------------------
Name                 Nalloc  Nfree   AvgMem   MaxMem
----------------------------------------------------
ImpactX::init_grids      25      5  263 KiB  475 KiB
----------------------------------------------------

AMReX (25.02) finalized
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5F.c line 1060 in H5Fclose(): decrementing file ID failed
    major: File accessibility
    minor: Unable to close file
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
Internal error: Failed to close HDF5 file (parallel)
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close bool enum
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex float type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 dataset transfer property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 file access property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5F.c line 1060 in H5Fclose(): decrementing file ID failed
    major: File accessibility
    minor: Unable to close file
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
Internal error: Failed to close HDF5 file (parallel)
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close bool enum
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex float type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 dataset transfer property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 file access property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5F.c line 1060 in H5Fclose(): decrementing file ID failed
    major: File accessibility
    minor: Unable to close file
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
Internal error: Failed to close HDF5 file (parallel)
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close bool enum
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex float type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 dataset transfer property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 file access property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5F.c line 1060 in H5Fclose(): decrementing file ID failed
    major: File accessibility
    minor: Unable to close file
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
Internal error: Failed to close HDF5 file (parallel)
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close bool enum
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex float type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5T.c line 2055 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
[HDF5] Internal error: Failed to close complex long double type
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 dataset transfer property
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) thread 0:
  #000: H5P.c line 1489 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: H5Iint.c line 1090 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: H5Iint.c line 1045 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: H5Iint.c line 951 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
[HDF5] Internal error: Failed to close HDF5 file access property
(impactx-cpu-mpich-dev) [egstern@WL-146057 multi_sim]$

egstern, Apr 17 '25 15:04