
hemera v100: openPMD-api output to hdf5 - simulation hangs

Open PrometheusPi opened this issue 3 years ago • 2 comments

Running the default Laser Wakefield example on hemera V100 GPUs with the h5 backend of openPMD-api instead of the bp backend leads to a hanging simulation. With bp, the simulation runs without any problems. After switching to h5, it hangs right after init.
The first h5 output file was written but never closed.

PrometheusPi avatar Oct 04 '22 14:10 PrometheusPi

I ran into this issue too. The h5 output works fine only when the code runs on a single GPU; the parallel HDF5 output seems to be incorrect. I still haven't found a solution.

zwjlpi avatar Jun 08 '24 18:06 zwjlpi

Often the problem comes from broken chunking in HDF5.

This could be a solution: https://github.com/ComputationalRadiationPhysics/picongpu/issues/4845#issuecomment-2009408453
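For reference, a minimal sketch of how HDF5 chunking might be switched off through the openPMD-api JSON configuration. It assumes the `hdf5.dataset.chunks` option (values `"auto"` or `"none"`) as described in the openPMD-api backend documentation; whether this is what the linked comment proposes is not confirmed here. The helper name `hdf5_chunking_config` is hypothetical, and how the string reaches PIConGPU (for example through an `--openPMD.json` plugin option) is an assumption to check against your version.

```python
import json


def hdf5_chunking_config(chunks: str = "none") -> str:
    """Build an openPMD-api JSON config string that sets HDF5 chunking.

    Hypothetical helper for illustration only. The key layout
    hdf5 -> dataset -> chunks is assumed from the openPMD-api docs.
    """
    if chunks not in ("auto", "none"):
        raise ValueError("chunks must be 'auto' or 'none'")
    return json.dumps({"hdf5": {"dataset": {"chunks": chunks}}})


# Config string that disables chunking for the HDF5 backend.
config = hdf5_chunking_config("none")
print(config)
```

The resulting string is plain JSON, so it can be pasted into a job script or a config file. If the hang (or extreme slowdown) disappears with chunking disabled, that would point to chunking as the cause.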

psychocoderHPC avatar Jun 10 '24 11:06 psychocoderHPC

Well, in the past I noticed that it does not actually hang, but writes very very very [many more 'very'] slowly. That's why I used bp, which was the solution for me.

steindev avatar Mar 21 '25 16:03 steindev