physicsnemo ⛰️[EPIC]: VFGN Integration

Tracker for the issues related to integrating VFGN

### Pytest
- [ ] https://github.com/NVIDIA/modulus/issues/451

Feb 08 '24 00:02 mnabian

current_errors

Excluding the entry-point error, there are 3 more errors.

Checkpointing is not compatible with .grad() or when an inputs parameter is passed to .backward().

This one occurs because our code uses the torch.utils.checkpoint function, which (in its reentrant mode) does not support torch.autograd.grad(). It only supports backward().
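A minimal sketch of one possible way around this, assuming the PyTorch version in use supports the `use_reentrant` argument: the non-reentrant checkpoint variant does work with `torch.autograd.grad()`. The function `block` below is a hypothetical stand-in, not VFGN code.

```python
import torch
from torch.utils.checkpoint import checkpoint


def block(x):
    # Hypothetical stand-in for a checkpointed sub-network.
    return torch.sin(x) * x


x = torch.linspace(0.0, 1.0, 5, requires_grad=True)

# use_reentrant=False selects the non-reentrant implementation,
# which is compatible with torch.autograd.grad().
y = checkpoint(block, x, use_reentrant=False).sum()
(grad_x,) = torch.autograd.grad(y, x)

# Analytic derivative of sin(x) * x, used only as a sanity check.
expected = (torch.sin(x) + x * torch.cos(x)).detach()
```

Whether switching the checkpoint mode is acceptable for VFGN (memory use, supported PyTorch versions) would still need to be checked.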

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This is mainly because VFGN has multiple inputs, including scalar integers, which cannot require gradients.
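As a rough illustration of the issue (not the actual test code), gradients can be requested only for floating-point inputs. The `model` function and the input names `positions` and `num_steps` below are hypothetical.

```python
import torch


def model(positions, num_steps):
    # Hypothetical stand-in for a model taking mixed float/integer inputs.
    return (positions * num_steps).sum()


positions = torch.randn(4, 3)
num_steps = torch.tensor(5)  # integer scalar; cannot require grad

inputs = [positions, num_steps]

# Enable gradients only on floating-point inputs.
for tensor in inputs:
    if tensor.is_floating_point():
        tensor.requires_grad_(True)

# Differentiate only with respect to inputs that actually require grad.
grad_inputs = [tensor for tensor in inputs if tensor.requires_grad]
loss = model(*inputs)
grads = torch.autograd.grad(loss, grad_inputs)
```

Here only `positions` ends up in `grad_inputs`, so the integer input is never asked for a gradient.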

TypeError: Object of type Tensor is not JSON serializable

I have no idea what is causing this error.

Feb 09 '24 23:02 jl626

TypeError: Object of type Tensor is not JSON serializable

It seems the save function in checkpoints.py does not accept a torch.Tensor or a predefined class as input. Initializing class LearnedSimulator() does create such tensors in the args "boundaries" and "normalization_stats".
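One possible workaround, sketched under assumptions: convert tensor-valued arguments to plain lists before they are passed to `json.dumps`. The helper `to_jsonable`, the `FakeTensor` stand-in (used so the sketch runs without PyTorch), and the nested layout of `normalization_stats` are all hypothetical; this is not how the save function is currently implemented.

```python
import json


def to_jsonable(value):
    """Recursively convert tensor-like values into JSON-friendly types."""
    # torch.Tensor and numpy.ndarray both provide tolist().
    if hasattr(value, "tolist"):
        return value.tolist()
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor."""

    def __init__(self, data):
        self._data = data

    def tolist(self):
        return self._data


# Assumed shape of the constructor arguments, for illustration only.
args = {
    "boundaries": FakeTensor([[0.0, 1.0], [0.0, 1.0]]),
    "normalization_stats": {"velocity": {"mean": FakeTensor([0.0, 0.0])}},
}

# json.dumps(args) would raise TypeError; the converted copy serializes fine.
encoded = json.dumps(to_jsonable(args))
```

On load, the lists would need to be turned back into tensors, so a round-trip test of save/load would be worth adding alongside any such change.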

First, check the fail-safes of the save/load functions:

try:
    model_1.save("folder_does_not_exist/checkpoint.mdlus")
except IOError:
    pass

Feb 13 '24 08:02 dearleiii

https://github.com/NVIDIA/modulus/pull/334

Aug 06 '24 19:08 mnabian