🐛[BUG]: GraphCast example fails with unexpected keyword argument 'mesh_level'

Open benkirk opened this issue 1 year ago • 1 comments

Version

0.7.0

On which installation method(s) does this occur?

Docker

Describe the issue

I'm attempting to run the examples/weather/graphcast example with nvidia-modulus 0.7.0 under the NGC 24.07 image, and I receive an unexpected keyword argument error:

Apptainer> python train_graphcast.py wb_mode=disabled synthetic_dataset=true
/usr/local/lib/python3.10/dist-packages/modulus/distributed/manager.py:346: UserWarning: Could not initialize using ENV, SLURM or OPENMPI methods. Assuming this is a single process job
  warn(
[08:38:48 - main - INFO] Rank: 0, Device: cuda:0
[08:38:48 - main - WARNING] Using synthetic dataset. Ignoring static dataset, cosine zenith angle, time of the year, and history. Also setting num_workers to 0.
[08:38:48 - main - INFO] Using torch.bfloat16 dtype
[08:38:48 - main - WARNING] Static dataset path is not provided. Setting num_channels_static to 0.
Error executing job with overrides: ['wb_mode=disabled', 'synthetic_dataset=true']
Traceback (most recent call last):
  File "/glade/work/benkirk/repos/csg-utils/hpc-demos/containers/AI_ML/NGC/apptainer/modulus/modulus/examples/weather/graphcast/train_graphcast.py", line 363, in main
    trainer = GraphCastTrainer(cfg, dist, rank_zero_logger)
  File "/glade/work/benkirk/repos/csg-utils/hpc-demos/containers/AI_ML/NGC/apptainer/modulus/modulus/examples/weather/graphcast/train_graphcast.py", line 105, in __init__
    self.model = GraphCastNet(
  File "/usr/local/lib/python3.10/dist-packages/modulus/models/module.py", line 65, in __new__
    bound_args = sig.bind_partial(
  File "/usr/lib/python3.10/inspect.py", line 3193, in bind_partial
    return self._bind(args, kwargs, partial=True)
  File "/usr/lib/python3.10/inspect.py", line 3175, in _bind
    raise TypeError(
TypeError: got an unexpected keyword argument 'mesh_level'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

(The same issue occurs via pip install.)

Any help would be appreciated!

Minimum reproducible example

python train_graphcast.py wb_mode=disabled synthetic_dataset=true

Relevant log output

No response

Environment details

No response

benkirk, Sep 12 '24 14:09

I can verify that the previous commit, bccede01c7c42ad5dadde81c3ad349b6814812bb, at least starts running.
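
For anyone hitting the same error: the traceback shows the failure coming from `sig.bind_partial(...)` in `modulus/models/module.py`, which is what I would expect if the example script passes a keyword that the installed model class does not declare. Below is a minimal sketch of that mechanism and of a quick check for rejected keywords. `InstalledNet`, `hidden_dim`, `processor_layers`, and `unexpected_kwargs` are hypothetical stand-ins for illustration only; they are not the real `GraphCastNet` signature or any Modulus API.

```python
import inspect


class InstalledNet:
    """Hypothetical stand-in for a model class from the installed package."""

    def __init__(self, hidden_dim=512, processor_layers=16):
        self.hidden_dim = hidden_dim
        self.processor_layers = processor_layers


def unexpected_kwargs(cls, kwargs):
    """Return the keyword names that cls.__init__ does not accept."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any keyword, so nothing is rejected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(name for name in kwargs if name not in params)


# Keywords a newer example script might pass (assumed values).
script_kwargs = {"mesh_level": 6, "hidden_dim": 512}

# Reproduce the failure mode from the traceback: binding the call
# arguments against the signature raises TypeError for unknown keywords.
try:
    inspect.signature(InstalledNet.__init__).bind_partial(None, **script_kwargs)
    bind_error = None
except TypeError as exc:
    bind_error = str(exc)

rejected = unexpected_kwargs(InstalledNet, script_kwargs)
print("bind_partial error:", bind_error)
print("keywords rejected by the installed class:", rejected)
```

Running the same kind of check against the class actually imported in the container (in place of `InstalledNet`) should show whether the checked-out example and the installed package disagree about the constructor arguments.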

benkirk, Sep 12 '24 14:09

Hi @benkirk, thanks for reporting the issue. As you rightly mentioned, this has been fixed in the latest GraphCast commits. Can we close this issue?

mnabian, Oct 17 '24 01:10