
⛰️[EPIC]: VFGN Integration

Open mnabian opened this issue 1 year ago • 2 comments

Tracker for the issues related to integrating VFGN

### Pytest
- [ ] https://github.com/NVIDIA/modulus/issues/451

mnabian avatar Feb 08 '24 00:02 mnabian

current_errors

Excluding the entry-point error, there are three more errors.

  1. Checkpointing is not compatible with .grad() or when an inputs parameter is passed to .backward().

This one occurs because our code uses the torch.utils.checkpoint function, which does not support torch.autograd.grad(). It only supports backward().

  2. RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This is mainly because VFGN has multiple inputs, including scalar integers.

  3. TypeError: Object of type Tensor is not JSON serializable

I have no idea what is causing this error.
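
For the first error in the list above, here is a minimal sketch of the behavior in plain PyTorch. `block` is a hypothetical stand-in for a checkpointed sub-network, not VFGN code, and whether VFGN can switch to the non-reentrant mode is an assumption. The reentrant checkpoint mode raises a RuntimeError under torch.autograd.grad(), while `use_reentrant=False` supports it.

```python
import torch
from torch.utils.checkpoint import checkpoint


def block(x):
    # Hypothetical toy stand-in for a checkpointed sub-network.
    return torch.sin(x) * x


def grad_through_checkpoint(x, use_reentrant):
    # Run the block under activation checkpointing, then ask for the
    # gradient with torch.autograd.grad instead of .backward().
    y = checkpoint(block, x, use_reentrant=use_reentrant).sum()
    (grad,) = torch.autograd.grad(y, x)
    return grad


x = torch.randn(4, requires_grad=True)

# Reentrant mode: autograd.grad is rejected inside the checkpoint's backward.
try:
    grad_through_checkpoint(x, use_reentrant=True)
    reentrant_failed = False
except RuntimeError:
    reentrant_failed = True

# Non-reentrant mode: the same call goes through.
grad = grad_through_checkpoint(x, use_reentrant=False)

# Analytic gradient of sum(sin(x) * x) for comparison.
expected = torch.cos(x) * x + torch.sin(x)
```

If the test only needs gradients, calling `.backward()` without an `inputs` argument is the other way around the error.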

jl626 avatar Feb 09 '24 23:02 jl626

TypeError: Object of type Tensor is not JSON serializable

It seems the save function in checkpoints.py does not accept a torch.Tensor or an instance of a predefined class as input, while the LearnedSimulator() initialization does create such tensors in the args "boundaries" and "normalization_stats".
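
One possible workaround, sketched under assumptions: convert tensor-valued constructor args to plain lists before they reach the JSON encoder. `to_jsonable` and `FakeTensor` are hypothetical names introduced for illustration, and the nested keys under "normalization_stats" are made up. The helper relies only on the `tolist()` method, which torch.Tensor provides, so the sketch runs without PyTorch installed.

```python
import json


def to_jsonable(value):
    """Recursively replace tensor-like objects with plain Python lists."""
    if hasattr(value, "tolist"):  # torch.Tensor and numpy arrays expose tolist()
        return value.tolist()
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch needs no PyTorch."""

    def __init__(self, data):
        self._data = data

    def tolist(self):
        return self._data


# Hypothetical constructor args shaped like the ones named in this comment.
init_args = {
    "boundaries": FakeTensor([[0.0, 1.0], [0.0, 1.0]]),
    "normalization_stats": {
        "velocity": {"mean": FakeTensor([0.0, 0.0]), "std": FakeTensor([1.0, 1.0])},
    },
    "num_dimensions": 2,
}

# Passing the raw args to json.dumps fails with a TypeError.
try:
    json.dumps(init_args)
    raw_failed = False
except TypeError:
    raw_failed = True

# After conversion the args serialize cleanly.
serialized = json.dumps(to_jsonable(init_args))
```

On load, the lists would have to be converted back to tensors before being passed to the constructor.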

First, check the fail-safes of the save/load functions:

try:
    model_1.save("folder_does_not_exist/checkpoint.mdlus")
except IOError:
    pass

dearleiii avatar Feb 13 '24 08:02 dearleiii

https://github.com/NVIDIA/modulus/pull/334

mnabian avatar Aug 06 '24 19:08 mnabian