🐛[BUG]: CorrDiff timing crash with small data iterator

Open jleinonen opened this issue 7 months ago • 0 comments

Version

Latest from source

On which installation method(s) does this occur?

Source

Describe the issue

In CorrDiff generate.py there seems to be a bug on these lines: https://github.com/NVIDIA/modulus/blob/15ea3c999679fbe99f554aa27a2967a1dc5c1fb1/examples/generative/corrdiff/generate.py#L369-L417

If the data iterator created at https://github.com/NVIDIA/modulus/blob/15ea3c999679fbe99f554aa27a2967a1dc5c1fb1/examples/generative/corrdiff/generate.py#L379 yields fewer than warmup_steps + 1 items, the timing start at https://github.com/NVIDIA/modulus/blob/15ea3c999679fbe99f554aa27a2967a1dc5c1fb1/examples/generative/corrdiff/generate.py#L384-L385 never executes. The start event is therefore never recorded, and the elapsed_time call at https://github.com/NVIDIA/modulus/blob/15ea3c999679fbe99f554aa27a2967a1dc5c1fb1/examples/generative/corrdiff/generate.py#L417 raises the RuntimeError shown in the log below.
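
One possible fix is to guard the elapsed-time computation so it only runs when timing actually started. The sketch below is not the code from generate.py: it assumes the loop starts its timer when the step index reaches warmup_steps, and it uses time.perf_counter in place of CUDA events so it runs anywhere. The names timed_loop and process are hypothetical.

```python
import time


def timed_loop(items, warmup_steps, process=lambda item: None):
    """Run process() on each item, timing only the steps after warmup.

    Returns (timed_steps, elapsed_seconds). If the iterator is exhausted
    before the timer starts, returns (0, None) instead of failing.
    """
    start_time = None  # stays None if we never reach warmup_steps
    timed_steps = 0

    for index, item in enumerate(items):
        # Assumed trigger: start timing once the warmup steps are done.
        if index == warmup_steps:
            start_time = time.perf_counter()

        process(item)

        if start_time is not None:
            timed_steps += 1

    # Guard: with fewer than warmup_steps + 1 items the timer never started,
    # so there is nothing to measure.
    if start_time is None:
        return 0, None

    return timed_steps, time.perf_counter() - start_time
```

With torch.cuda.Event the same idea would apply: track whether start.record() was reached and skip start.elapsed_time(end) (or log a warning) when it was not.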

Minimum reproducible example

No response

Relevant log output

Traceback (most recent call last):
  File "<corrdiff_path>/examples/generative/corrdiff/generate.py", line 308, in main
    generate_and_save (
  File "<corrdiff_path>/examples/generative/corrdiff/generate.py", line 417, in generate_and_save
    elapsed_time = start.elapsed_time(end) / 1000.0 # Convert ms to s
  File "/home/ubuntu/python3.10_env_modulus_latest/lib/python3.10/site-packages/torch/cuda/streams.py", line 213, in elapsed_time
    return super().elapsed_time(end_event)
RuntimeError: Both events must be recorded before calculating elapsed time.

Environment details

No response

jleinonen, Jul 10 '24 16:07