modulus
🐛[BUG]: GraphCast example with synthetic dataset requires ERA5 metadata
Version
0.7.0
On which installation method(s) does this occur?
Docker
Describe the issue
Hi,
I started a new project with the GraphCast example. I wanted to test it on the synthetic dataset before downloading the ERA5 data, but it turns out that `loss.py` requires metadata from ERA5, which is missing.
I installed Modulus from Docker (24.07), installed the missing `mlflow` package with pip, changed `num_samples_per_year_train` to 1 to fit into memory, and started training on the synthetic dataset with `python train_graphcast.py synthetic_dataset=true`.
Training fails because the loss function tries to open a metadata file from the ERA5 dataset.
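The failure mode can be illustrated in isolation. The sketch below is not the Modulus code: `load_metadata_or_none` is a hypothetical helper, and the path in the usage lines is made up. It only shows the kind of existence check that avoids the unconditional `open(dataset_metadata_path, "r")` seen in the traceback, assuming the caller can fall back to some default when no metadata is available.

```python
import json
from pathlib import Path


def load_metadata_or_none(dataset_metadata_path):
    """Return the parsed JSON metadata, or None if the file is absent.

    Hypothetical helper for illustration only; not part of Modulus.
    """
    path = Path(dataset_metadata_path)
    if not path.is_file():
        # Without this check, open() raises FileNotFoundError,
        # which is the error shown in the log output.
        return None
    with open(path, "r") as f:
        return json.load(f)


# Usage with a made-up path that does not exist.
metadata = load_metadata_or_none("/nonexistent/metadata/data.json")
if metadata is None:
    print("No dataset metadata found; a fallback would be needed here.")
```

Whether a fallback is appropriate for the synthetic dataset depends on what the loss function needs from the metadata, which this sketch does not address.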
Minimum reproducible example
python train_graphcast.py synthetic_dataset=true
Relevant log output
```shell
root@be1b9fafbee5:/data/codes/modulus/examples/weather/graphcast# python train_graphcast.py synthetic_dataset=true
/usr/local/lib/python3.10/dist-packages/modulus/distributed/manager.py:346: UserWarning: Could not initialize using ENV, SLURM or OPENMPI methods. Assuming this is a single process job
warn(
[11:10:46 - main - INFO] Rank: 0, Device: cuda:0
[11:10:46 - main - WARNING] Using Dummy dataset. Ignoring static dataset, cosine zenith angle, time of the year, and history. Also setting num_workers to 0.
[11:10:47 - main - INFO] Using torch.bfloat16 dtype
[11:10:47 - main - WARNING] Static dataset path is not provided. Setting num_channels_static to 0.
[11:10:57 - main - INFO] Model parameter count is 35296329
Generated synthetic temperature data in 4.07 seconds.
[11:11:02 - main - INFO] Loaded training datapipe of size 0
Error executing job with overrides: ['synthetic_dataset=true']
Traceback (most recent call last):
  File "/data/codes/modulus/examples/weather/graphcast/train_graphcast.py", line 349, in main
    trainer = GraphCastTrainer(cfg, dist, rank_zero_logger)
  File "/data/codes/modulus/examples/weather/graphcast/train_graphcast.py", line 211, in __init__
    self.criterion = GraphCastLossFunction(
  File "/usr/local/lib/python3.10/dist-packages/modulus/utils/graphcast/loss.py", line 129, in __init__
    self.channel_dict = self.get_channel_dict(dataset_metadata_path, channels_list)
  File "/usr/local/lib/python3.10/dist-packages/modulus/utils/graphcast/loss.py", line 173, in get_channel_dict
    with open(dataset_metadata_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/era5_75var/metadata/data.json'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```
Environment details
docker run --gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --runtime nvidia --rm -it nvcr.io/nvidia/modulus/modulus:xx.xx bash
pip install mlflow