stable-dreamfusion

Some problems when I use "-O2"

Open SakuBorder opened this issue 1 year ago • 3 comments

Description

When using "-O2", training fails with "NaN or Inf found in input tensor." Especially after updating the code, it appears from the first epoch, like this:

loss=nan (nan), lr=0.004063: : 1% 1/100 [00:00<00:17, 5.55it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004062: : 2% 2/100 [00:00<00:17, 5.57it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004061: : 3% 3/100 [00:00<00:17, 5.58it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004060: : 4% 4/100 [00:00<00:17, 5.58it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004059: : 5% 5/100 [00:00<00:17, 5.58it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004059: : 6% 6/100 [00:01<00:16, 5.58it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004058: : 7% 7/100 [00:01<00:16, 5.59it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004057: : 8% 8/100 [00:01<00:16, 5.59it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004056: : 9% 9/100 [00:01<00:16, 5.58it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004055: : 10% 10/100 [00:01<00:16, 5.59it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004054: : 11% 11/100 [00:01<00:15, 5.57it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004053: : 12% 12/100 [00:02<00:16, 5.49it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004052: : 13% 13/100 [00:02<00:15, 5.50it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004051: : 14% 14/100 [00:02<00:15, 5.52it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004050: : 15% 15/100 [00:02<00:15, 5.54it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004049: : 16% 16/100 [00:02<00:15, 5.54it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004048: : 17% 17/100 [00:03<00:14, 5.55it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004047: : 18% 18/100 [00:03<00:14, 5.50it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004046: : 19% 19/100 [00:03<00:14, 5.48it/s]NaN or Inf found in input tensor.
loss=nan (nan), lr=0.004045: : 20% 20/100 [00:03<00:14, 5.48it/s]NaN or Inf found in input tensor.

The generated image is pure black.
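One way to narrow this down is to stop at the first non-finite loss instead of letting training run on. The following is only a minimal sketch, assuming the loss is available as a Python float (for example via `loss.item()`); `assert_finite_loss` is a hypothetical helper and not part of stable-dreamfusion.

```python
import math


def assert_finite_loss(loss_value: float, step: int) -> float:
    """Return loss_value unchanged, or raise if it is NaN or +/-Inf.

    Hypothetical helper for illustration; not part of stable-dreamfusion.
    """
    if not math.isfinite(loss_value):
        raise FloatingPointError(f"non-finite loss {loss_value!r} at step {step}")
    return loss_value


if __name__ == "__main__":
    # Made-up loss values standing in for a real training loop.
    losses = [0.52, 0.47, float("nan")]
    for step, value in enumerate(losses, start=1):
        try:
            assert_finite_loss(value, step)
        except FloatingPointError as err:
            print(f"stopping: {err}")
            break
```

In a PyTorch training loop, `torch.autograd.set_detect_anomaly(True)` can additionally report which backward operation produced the NaN, at the cost of slower training.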

Steps to Reproduce

  1. Pass the "-O2" option

  2. Run this command: 'python main.py -O2 --image data/teddy_rgba.png --workspace trial_image --iters 5000'

Expected Behavior

Training should run without the "NaN or Inf found in input tensor" error and produce the generated 3D results correctly.

Environment

Ubuntu 18.04, torch 2.0.0+cu117

SakuBorder avatar May 12 '23 00:05 SakuBorder

I got this same error when I accidentally messed up my CUDA and torch versions. CUDA 11.7 in an isolated conda environment works for me.
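To compare setups, it can help to print which torch build and CUDA version are actually active in the current environment. This is just a minimal sketch; `describe_torch_env` is a hypothetical helper name, and it returns early if torch is not installed.

```python
import importlib.util


def describe_torch_env() -> dict:
    """Collect the torch version and the CUDA version it was built against.

    Hypothetical helper for illustration. Returns {"torch": None} when
    torch is not installed in the active environment.
    """
    if importlib.util.find_spec("torch") is None:
        return {"torch": None}

    import torch

    return {
        "torch": torch.__version__,                    # e.g. a "+cu117" build
        "cuda_build": torch.version.cuda,              # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),   # driver and GPU usable
    }


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

Running this inside and outside the conda environment shows whether the two are picking up different torch or CUDA builds.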

timsterzizzle avatar May 12 '23 02:05 timsterzizzle

I also found out that switching to the configuration for -O2 from 033fa66fd7e696c089564cb28ae6909e5993f8e7 works as well.

aradhyamathur avatar May 12 '23 20:05 aradhyamathur

> I also found out that switching to the configuration for -O2 from 033fa66 works as well.

Thanks, I will give it a try.

SakuBorder avatar May 16 '23 00:05 SakuBorder