
VAE_Cartoon_Tensorflow Training

Open ReidTPowell opened this issue 1 year ago • 4 comments

Training the VAE fails with the following error:

UnboundLocalError: local variable 'kl_loss' referenced before assignment

Any thoughts or suggestions on how to resolve this error would be greatly appreciated.

ReidTPowell avatar May 21 '23 22:05 ReidTPowell

Apparently the loss variable is being accessed in the code before it has been initialised.

RohitDhankar avatar May 22 '23 02:05 RohitDhankar

Thanks for the prompt response, and sorry for the vague description of the error. The full traceback is:


UnboundLocalError                         Traceback (most recent call last)
Cell In[36], line 1
----> 1 train(normalized_ds, 30)

Cell In[34], line 8, in train(dataset, epochs)
      6 for image_batch in dataset:
      7     i += 1
----> 8     loss = train_step(image_batch)
      9     #loss_.append(loss)
     10
     11     #print("Loss",np.mean(loss_))
     12     seed = image_batch[:25]

File C:\Anaconda3\envs\tensorflow29\lib\site-packages\tensorflow\python\util\traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    151 except Exception as e:
    152     filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153     raise e.with_traceback(filtered_tb) from None
    154 finally:
    155     del filtered_tb

File ~\AppData\Local\Temp\__autograph_generated_file3_wyyupw.py:14, in outer_factory.<locals>.inner_factory.<locals>.tf__train_step(images)
     12 latent = ag__.converted_call(ag__.ld(final), ([ag__.ld(mean), ag__.ld(log_var)],), None, fscope)
     13 generated_images = ag__.converted_call(ag__.ld(dec), (ag__.ld(latent),), dict(training=True), fscope)
---> 14 loss = ag__.converted_call(ag__.ld(vae_loss), (ag__.ld(images), ag__.ld(generated_images), ag__.ld(mean), ag__.ld(log_var)), None, fscope)
     15 gradients_of_enc = ag__.converted_call(ag__.ld(encoder).gradient, (ag__.ld(loss), ag__.ld(enc).trainable_variables), None, fscope)
     16 gradients_of_dec = ag__.converted_call(ag__.ld(decoder).gradient, (ag__.ld(loss), ag__.ld(dec).trainable_variables), None, fscope)

File ~\AppData\Local\Temp\__autograph_generated_filem_niztwx.py:11, in outer_factory.<locals>.inner_factory.<locals>.tf__vae_loss(y_true, y_pred, mean, var)
      9 retval_ = ag__.UndefinedReturnValue()
     10 r_loss = ag__.converted_call(ag__.ld(mse_loss), (ag__.ld(y_true), ag__.ld(y_pred)), None, fscope)
---> 11 kl_loss = ag__.converted_call(ag__.ld(kl_loss), (ag__.ld(mean), ag__.ld(log_var)), None, fscope)
     12 try:
     13     do_return = True

UnboundLocalError: in user code:

File "C:\Users\rpowell\AppData\Local\Temp\ipykernel_793660\144731484.py", line 11, in train_step  *
    loss = vae_loss(images, generated_images, mean, log_var)
File "C:\Users\rpowell\AppData\Local\Temp\ipykernel_793660\2804652454.py", line 11, in vae_loss  *
    kl_loss = kl_loss(mean, log_var)

UnboundLocalError: local variable 'kl_loss' referenced before assignment

I did run the following block before training:

    def mse_loss(y_true, y_pred):
        r_loss = K.mean(K.square(y_true - y_pred), axis = [1,2,3])
        return 1000 * r_loss

    def kl_loss(mean, log_var):
        kl_loss = -0.5 * K.sum(1 + log_var - K.square(mean) - K.exp(log_var), axis = 1)
        return kl_loss

    def vae_loss(y_true, y_pred, mean, var):
        r_loss = mse_loss(y_true, y_pred)
        kl_loss = kl_loss(mean, log_var)
        return r_loss + kl_loss

I could comment out the kl_loss call in the vae_loss function to get it to run, so I think something is wrong with that function, but I am admittedly naive in this space.


ReidTPowell avatar May 22 '23 13:05 ReidTPowell
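As far as I can tell, the UnboundLocalError in the traceback above can be reproduced with plain Python scoping, without TensorFlow or AutoGraph. Inside a function, any name that is assigned anywhere in the body is treated as local for the whole body, so `kl_loss = kl_loss(...)` tries to call the local `kl_loss` before it has a value. The sketch below is a minimal reproduction under that assumption; `vae_loss_broken`, `vae_loss_fixed`, and the stand-in arithmetic are hypothetical names for illustration, not code from the notebook.

```python
def kl_loss(mean, log_var):
    # Stand-in for the real KL term; the arithmetic does not matter here.
    return mean + log_var


def vae_loss_broken(mean, log_var):
    # The assignment makes `kl_loss` a local name for the whole function,
    # so the call on the right-hand side reads the still-unassigned local.
    kl_loss = kl_loss(mean, log_var)
    return kl_loss


def vae_loss_fixed(mean, log_var):
    # A different local name leaves the module-level function visible.
    kl = kl_loss(mean, log_var)
    return kl


try:
    vae_loss_broken(1.0, 2.0)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

fixed_value = vae_loss_fixed(1.0, 2.0)
print(error_name, fixed_value)
```

The `kl_loss` function itself can safely assign to a local called `kl_loss`, because there the name is assigned before it is read; only the caller that both calls and rebinds the name is affected.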

Please correct me if I'm wrong; I may end up confusing you rather than providing a solution.

I presume you are using this notebook: VAE_Cartoon_TensorFlow.ipynb

Where is the bit of code below coming from? It is probably a temporary file generated by Keras or tf.autograph, which suggests that the notebook cells were not all run in one continuous kernel session. So when the AutoGraph code needs the value of kl_loss, it cannot find it. See https://www.tensorflow.org/api_docs/python/tf/autograph

File ~\AppData\Local\Temp\__autograph_generated_filem_niztwx.py:11, in outer_factory.<locals>.inner_factory.<locals>.tf__vae_loss(y_true, y_pred, mean, var)
      9 retval_ = ag__.UndefinedReturnValue()
     10 r_loss = ag__.converted_call(ag__.ld(mse_loss), (ag__.ld(y_true), ag__.ld(y_pred)), None, fscope)
---> 11 kl_loss = ag__.converted_call(ag__.ld(kl_loss), (ag__.ld(mean), ag__.ld(log_var)), None, fscope)

RohitDhankar avatar May 22 '23 19:05 RohitDhankar

I think that may be the issue. When I installed the requirements, tf-nightly-gpu had been deprecated, so I ended up installing tensorflow==2.6.2. From the documentation at the AutoGraph link, it looks like those functions are in a different place after TF 2.0…


ReidTPowell avatar May 23 '23 15:05 ReidTPowell
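One possible fix, offered as a sketch rather than a confirmed resolution: in the posted `vae_loss`, the line `kl_loss = kl_loss(mean, log_var)` rebinds `kl_loss` as a local name, which Python reports as an UnboundLocalError regardless of the TensorFlow version. Renaming the local variable should avoid it. The posted signature also takes `var` while the body uses `log_var`; the sketch assumes the parameter was meant to be `log_var`. The `tf.*` ops are my substitution for the `K` backend calls, and the tensor shapes in the usage lines are hypothetical.

```python
import tensorflow as tf


def mse_loss(y_true, y_pred):
    # Per-image mean squared error over height, width, and channels,
    # scaled by 1000 as in the block posted in the thread.
    r_loss = tf.reduce_mean(tf.square(y_true - y_pred), axis=[1, 2, 3])
    return 1000 * r_loss


def kl_loss(mean, log_var):
    # KL term summed over the latent dimension (axis 1).
    kl = -0.5 * tf.reduce_sum(
        1 + log_var - tf.square(mean) - tf.exp(log_var), axis=1
    )
    return kl


def vae_loss(y_true, y_pred, mean, log_var):
    r_loss = mse_loss(y_true, y_pred)
    # Local name differs from the function name, so `kl_loss` still
    # refers to the module-level function when it is called.
    kl = kl_loss(mean, log_var)
    return r_loss + kl


# Hypothetical shapes: a batch of 2 images and an 8-dimensional latent space.
images = tf.zeros([2, 4, 4, 3])
mean = tf.zeros([2, 8])
log_var = tf.zeros([2, 8])
loss = vae_loss(images, images, mean, log_var)
print(loss.shape)
```

After redefining the functions, re-running the cell that defines `train_step` before calling `train` again makes sure the traced function picks up the new definitions.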