Luke Conibear
@CharlelieLrt This is not using old checkpoints; these are all new runs and checkpoints for all steps. Yes, I used that exact config for generation. I used the default configs...
@CharlelieLrt Thanks for the quick response. Unfortunately, yes, the generation `RuntimeError` issue is still there.

---

For the timing comment, I was confused in my comparisons. Sorry for wasting time...
Thanks for your help.

- I used this recent [commit](https://github.com/NVIDIA/physicsnemo/commit/7c6c912cc0e2a1601e48e0aa89a3281b3276ff6d) for all steps.
- Commands:

```bash
# Regression
torchrun --standalone --nnodes=1 --nproc_per_node=2 train.py \
    --config-name=config_training_hrrr_mini_regression.yaml \
    model=regression \
    ++dataset.data_path=${{inputs.data_path}} \
    ++dataset.stats_path=${{inputs.stats_path}} \
    ++training.hp.total_batch_size=2560 \
    ++training.hp.batch_size_per_gpu=640 \
    ++training.perf.dataloader_workers=1 ...
```
The above information is for non-patched diffusion, as I cannot get the patched version to work. I've tried many config/hydra variants, e.g., appending to the command:

```python
f"model=patched_diffusion ++training.hp.patch_shape_x={patch_shape_x} ++training.hp.patch_shape_y={patch_shape_y} ..."
```
@CharlelieLrt Okay, great, thanks a lot for the help. Yes, you're right about the patch shape. I used 32 and patched diffusion works. Then generation for patched diffusion has the...
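For reference, the Hydra overrides quoted earlier can be assembled like this. This is a minimal sketch, not the project's own launch code: the option names (`model`, `++training.hp.patch_shape_x`, `++training.hp.patch_shape_y`) are taken verbatim from the comments above, and 32 is the patch shape that worked in this thread; everything else is assumed for illustration.

```python
# Hypothetical helper for building the patched-diffusion override string
# that gets appended to the torchrun command. Option names mirror those
# quoted in this thread; the values are assumptions for illustration.
def patched_diffusion_overrides(patch_shape_x: int, patch_shape_y: int) -> str:
    return (
        f"model=patched_diffusion "
        f"++training.hp.patch_shape_x={patch_shape_x} "
        f"++training.hp.patch_shape_y={patch_shape_y}"
    )

# Patch shape 32 is the value reported to work above.
overrides = patched_diffusion_overrides(32, 32)
print(overrides)
```

The resulting string would simply be appended to the `torchrun ... train.py` invocation alongside the other `++` overrides.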
@CharlelieLrt Thanks a lot for the great help here. I confirm this is fixed.