
1D AdvectionEq Dataset dimensions

Open qwerfdsadad opened this issue 1 year ago • 1 comments

Hello! Your work on PDEBench has been a strong support for driving innovation in machine learning simulation, and I thank you for your contribution. I have recently been studying your project code.

I went to the /data_gen_NLE/AdvectionEq folder to generate the 1D Advection dataset, and used Data_Merge.py to convert the data to HDF5 format.

CUDA_VISIBLE_DEVICES='2,3' python3 advection_multi_solution_Hydra.py +multi=beta1e0.yaml

The config file used to generate the 1D_Advection_Sols_beta1.0.hdf5 file is:

save: '/save/advection/'
dt_save: 0.01
ini_time: 0.
fin_time: 2.
nx: 1024
xL: 0.
xR: 1.
beta : 1.e0
if_show: 1
numbers: 10
CFL: 4.e-1
if_second_order: 1.
show_steps: 100
init_key: 2022
if_rand_param: False

When I used the generated data for testing,

CUDA_VISIBLE_DEVICES='3' python3 train_models_forward.py +args=config_Adv.yaml ++args.filename='1D_Advection_Sols_beta1.0.hdf5' ++args.model_name='FNO'

a problem occurred.

I checked the dimensions of the data: it is [1, 10, 201, 1024], so models/fno/utils.py sees a 4-axis shape and treats it as a 2D dataset.
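For context, the shape-based dispatch can be sketched roughly as follows (a simplified illustration, not the actual PDEBench code):

```python
import numpy as np

# Simplified sketch of a shape-based dispatch like the one in
# models/fno/utils.py: the number of axes decides the spatial
# dimensionality of the problem.
def guess_dim(data):
    if data.ndim == 3:    # (batch, time, x) -> 1D problem
        return 1
    elif data.ndim == 4:  # (batch, time, x, y) -> 2D problem
        return 2
    raise ValueError(f"unexpected shape {data.shape}")

good = np.zeros((10, 201, 1024))    # expected 1D layout
bad = np.zeros((1, 10, 201, 1024))  # extra leading singleton axis
print(guess_dim(good))  # 1
print(guess_dim(bad))   # 2 -- misread as a 2D dataset
```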

Then, comes the problem.

FNO
Epochs = 5, learning rate = 0.001, scheduler step = 100, scheduler gamma = 0.5
FNODatasetSingle
Error executing job with overrides: ['+args=config_Adv.yaml', '++args.filename=1D_Advection_Sols_beta2.0.hdf5', '++args.model_name=FNO']
Traceback (most recent call last):
  File "/home/wm/PDEBench/pdebench/models/train_models_forward.py", line 253, in <module>
    main()
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/home/wm/PDEBench/pdebench/models/train_models_forward.py", line 166, in main
    run_training_FNO(
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/pdebench/models/fno/train.py", line 65, in run_training
    train_data = FNODatasetSingle(flnm,
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/pdebench/models/fno/utils.py", line 326, in __init__
    _data = np.array(f['nu'], dtype=np.float32)  # batch, time, x,...
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/wm/miniconda3/envs/pdebench/lib/python3.9/site-packages/h5py/_hl/group.py", line 357, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 189, in h5py.h5o.open
KeyError: "Unable to synchronously open object (object 'nu' doesn't exist)"

Yet, when I instead tested with the dataset data/1D/Advection/Train/1D_Advection_Sols_beta7.0.hdf5, downloaded via the /pdebench/data_download/ directory, its shape was [10000, 201, 1024] and the problem disappeared.

So, which parameter in the configuration file determines the dimensions of the dataset?

qwerfdsadad avatar Mar 08 '24 14:03 qwerfdsadad

https://github.com/pdebench/PDEBench/issues/56#issuecomment-1997459527

mtakamoto-D avatar Mar 14 '24 13:03 mtakamoto-D

As you've noticed, the JAX simulator adds a leading singleton dimension, which is not squeezed out by the serialization code in Data_Merge.py. You can fix this by modifying the script, specifically changing the line at Data_Merge.py#L333 to

f.create_dataset('tensor', data = _data.astype(np.float32).squeeze())

And that should fix the above issue.
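For illustration, a minimal sketch of what .squeeze() does here, using the shapes reported above:

```python
import numpy as np

# The solver output reportedly has shape (1, 10, 201, 1024);
# squeeze() drops all length-1 axes before serialization.
_data = np.zeros((1, 10, 201, 1024), dtype=np.float32)
squeezed = _data.astype(np.float32).squeeze()
print(squeezed.shape)  # (10, 201, 1024) -- matches the downloaded datasets
```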

kmario23 avatar May 14 '24 18:05 kmario23

Fixed in https://github.com/pdebench/PDEBench/commit/7acb9945dfb7d714e2cfd1b7ec3eb2db2c20e6c8

kmario23 avatar May 14 '24 18:05 kmario23

This has been finally fixed in commit https://github.com/pdebench/PDEBench/commit/b470a487fc37644c79d1c34d632d3360eddee357.

To generate the 1D Advection dataset, try the following:

# generate data and save as .npy array
$ cd PDEBench/pdebench/data_gen/data_gen_NLE/AdvectionEq
$ CUDA_VISIBLE_DEVICES='2,3' python3 advection_multi_solution_Hydra.py +multi=beta1e0.yaml

# serialize to hdf5 by transforming npy file
$ cd ..
$ python Data_Merge.py
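After merging, you can sanity-check the keys and shapes of the resulting HDF5 file. A minimal illustration using an in-memory h5py file as a stand-in (the 'tensor' key and shapes mirror the thread above; your actual file path will differ):

```python
import h5py
import numpy as np

# In-memory stand-in for the merged file (backing_store=False keeps it off disk).
# A correctly merged 1D Advection file should expose a 'tensor' dataset
# of shape (samples, time, x) with no leading singleton axis.
with h5py.File("check.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("tensor", data=np.zeros((10, 201, 1024), dtype=np.float32))
    print(list(f.keys()))     # ['tensor']
    print(f["tensor"].shape)  # (10, 201, 1024)
```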

kmario23 avatar Oct 02 '24 21:10 kmario23