
Remove hardcoded iteration number from data shuffler

Open franzpoeschel opened this issue 2 years ago • 2 comments

Until now, the DataShuffler replaced the * pattern in the file name with the snapshot number before passing it to openPMD. It is better to substitute %T instead, so that openPMD knows the structure of the filename and treats the files as iterations of one file-based series. With this fix, the output data can be accessed as a single data series with openPMD tooling, e.g.

> openpmd-ls Be_shuffled%T.in.h5
openPMD series: Be_shuffled%T.in
openPMD standard: 1.1.0
openPMD extensions: 0

data author: ...
data created: 2024-03-11 11:31:55 +0100
data backend: HDF5
generating machine: unknown
generating software: MALA (version: 1.2.1)
generating software dependencies: unknown

number of iterations: 2 (fileBased)
  all iterations: 0 1 

number of meshes: 1
  all meshes:
    Bispectrum

number of particle species: 0
franzpoeschel:~/git-repos/mala/examples/advanced
> openpmd-ls Be_shuffled%T.out.h5
openPMD series: Be_shuffled%T.out
openPMD standard: 1.1.0
openPMD extensions: 0

data author: ...
data created: 2024-03-11 11:31:55 +0100
data backend: HDF5
generating machine: unknown
generating software: MALA (version: 1.2.1)
generating software dependencies: unknown

number of iterations: 2 (fileBased)
  all iterations: 0 1 

number of meshes: 1
  all meshes:
    LDOS

number of particle species: 0
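
For illustration, the substitution described above can be sketched in Python roughly as follows. This is not the actual DataShuffler code: the helper names (to_openpmd_pattern, expand_iteration, list_series) are hypothetical and the save name "Be_shuffled*" is assumed. The last helper needs openpmd-api to be installed and is only meant to mirror what openpmd-ls prints above.

```python
def to_openpmd_pattern(save_name: str) -> str:
    """Turn a '*'-style save name into an openPMD file-based pattern.

    Hypothetical helper: exactly one '*' is assumed to mark the position
    of the snapshot number in the file name.
    """
    if save_name.count("*") != 1:
        raise ValueError("expected exactly one '*' in the save name")
    # '%T' is the openPMD placeholder for the iteration index.
    return save_name.replace("*", "%T")


def expand_iteration(pattern: str, iteration: int) -> str:
    """Show which file name an unpadded '%T' yields for one iteration."""
    return pattern.replace("%T", str(iteration))


def list_series(pattern: str) -> dict:
    """Open the files as one series and list the meshes per iteration.

    Requires openpmd-api; imported lazily so the helpers above work
    without it.
    """
    import openpmd_api as io

    series = io.Series(pattern, io.Access.read_only)
    return {i: list(series.iterations[i].meshes) for i in series.iterations}


if __name__ == "__main__":
    pattern = to_openpmd_pattern("Be_shuffled*") + ".in.h5"
    print(pattern)
    print([expand_iteration(pattern, i) for i in (0, 1)])
```

The point of the design is that the iteration number is left for openPMD to fill in, rather than being baked into the name beforehand, so all snapshots end up in one series.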

franzpoeschel avatar Mar 11 '24 10:03 franzpoeschel

The Be_snapshot files in the test-data repository are affected by the same issue:

> openpmd-ls Be_snapshot%T.out.h5
openPMD series: Be_snapshot%T.out
openPMD standard: 1.1.0
openPMD extensions: 0

data author: ...
data created: 2023-05-23 15:13:58 +0200
data backend: HDF5
generating machine: unknown
generating software: MALA (version: 1.1.0)
generating software dependencies: unknown

number of iterations: 4 (groupBased)
  all iterations: An error occurred while opening the specified openPMD series!
Internal error: Group/Variable-based encoding: Parse preference must be set.
This is a bug. Please report at ' https://github.com/openPMD/openPMD-api/issues'.

> openpmd-ls Be_snapshot%T.in.h5
openPMD series: Be_snapshot%T.in
openPMD standard: 1.1.0
openPMD extensions: 0

data author: ...
data created: 2023-05-23 15:13:45 +0200
data backend: HDF5
generating machine: unknown
generating software: MALA (version: 1.1.0)
generating software dependencies: unknown

number of iterations: 4 (groupBased)
  all iterations: An error occurred while opening the specified openPMD series!
Internal error: Group/Variable-based encoding: Parse preference must be set.
This is a bug. Please report at ' https://github.com/openPMD/openPMD-api/issues'.

(Yes, openPMD seems to have insufficient error handling here; that will need to be fixed separately.) EDIT: This seems to be fixed already on our dev branch, heh.

To fix this, can someone tell me how these files are created in the first place?

franzpoeschel avatar Mar 11 '24 11:03 franzpoeschel

I just saw this PR, do you still need input here?

RandomDefaultUser avatar Apr 04 '24 16:04 RandomDefaultUser

This is now ready for review

franzpoeschel avatar May 30 '24 09:05 franzpoeschel

Looks good to me, thank you!

RandomDefaultUser avatar May 30 '24 14:05 RandomDefaultUser