Reinvent
TypeError: 'NoneType' object is not callable
Hi! I ran into a problem when running a transfer learning demo in the environment I built.
(Only 1 epoch is set for training.)
python3 input.py ../../test/transfer_learning_config.json
/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /opt/conda/conda-bld/pytorch_1607370128159/work/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
100%|######################################################################################################################| 1/1 [00:01<00:00, 1.14s/it]
19:11:46: base_transfer_learning_logger.log_message +26: INFO Collecting data for epoch 1
Exception ignored in: <function LocalTransferLearningLogger.__del__ at 0x7f62e2a863b0>
Traceback (most recent call last):
File "/mnt/c/Users/YANG/Desktop/xtai/ai/Reinvent-master/running_modes/transfer_learning/logging/local_transfer_learning_logger.py", line 20, in __del__
File "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 1033, in close
File "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 133, in flush
File "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 106, in flush
File "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 155, in flush
File "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/queue.py", line 89, in join
File "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/threading.py", line 289, in wait
TypeError: 'NoneType' object is not callable
I don't know how to solve it. Could you help me, please? Thanks a lot!
Hi, it looks like you don't have a CUDA-compatible GPU on the machine where you are trying to run REINVENT.
When I ran REINVENT on another computer with a correct configuration (with CUDA), I got the same problem. The only difference is that the first few lines were not printed.
The error happened in the file "/home/young/miniconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", in "self._summary_writer.close()", but I don't know how to fix it.
We have recently introduced a few bugfixes, one of which looks relevant to the error you are reporting. Could you please make sure you are running the latest version and let us know if the error persists?
I installed the package today and I get the same error message. Any help will be greatly appreciated.
Could you please share more details? Is it also in Transfer Learning mode? Also, which message exactly: the missing CUDA warning or something different?
Yes, it is during transfer learning using the "Transfer_Learning_Demo" Jupyter Notebook. I train for 10 epochs (the default setting) using the sample .smi file.
This is the error message that I get while running the demo:
8:19:47: base_transfer_learning_logger.log_message +29: INFO Collecting data for epoch 10
Exception ignored in: <function LocalTransferLearningLogger.__del__ at 0x7f55ebeed680>
Traceback (most recent call last):
File "/home/apg/Desktop/Vertex/2021/Machine_Learning/Reinvent/running_modes/transfer_learning/logging/local_transfer_learning_logger.py", line 20, in __del__
File "/home/apg/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 1033, in close
File "/home/apg/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 133, in flush
File "/home/apg/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 106, in flush
File "/home/apg/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 155, in flush
File "/home/apg/anaconda3/envs/reinvent.v3.0/lib/python3.7/queue.py", line 89, in join
File "/home/apg/anaconda3/envs/reinvent.v3.0/lib/python3.7/threading.py", line 289, in wait
TypeError: 'NoneType' object is not callable
I also encountered a similar problem while running Transfer_Learning_Demo_Teachers_Forcing.py.
10:09:03: base_transfer_learning_logger.log_message +29: INFO Using adaptative learning rate decay (gamma=0.8, threshold=0.0001, avg=4)
Exception ignored in: <function LocalTransferLearningLogger.__del__ at 0x2b2285710f80>
Traceback (most recent call last):
File "/home/xyun/Reinvent/running_modes/transfer_learning/logging/local_transfer_learning_logger.py", line 20, in __del__
File "/home/xyun/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 1033, in close
File "/home/xyun/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 133, in flush
File "/home/xyun/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 106, in flush
File "/home/xyun/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 156, in flush
File "/home/xyun/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/record_writer.py", line 42, in flush
File "/home/xyun/anaconda3/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 230, in flush
AttributeError: 'NoneType' object has no attribute 'raise_exception_on_not_ok_status'
I am having a similar error. I am trying to run a transfer learning demo; it does run all 3 epochs but then produces the same NoneType error. Any suggestions? Thanks.
(ReinventCommunity) jupyter@ai-design-tools-rj:~/ReinventCommunity/REINVENT_sampling_demo$ cat ../REINVENT_transfer_learning_demo/run.err
20:09:52: base_transfer_learning_logger.log_message +29: INFO Using adaptative learning rate decay (gamma=0.8, threshold=0.0001, avg=4)
100%|#######################################| 1341/1341 [01:42<00:00, 13.04it/s]
20:13:01: base_transfer_learning_logger.log_message +29: INFO Collecting data for epoch 1
/opt/conda/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:628: UserWarning: The epoch parameter in scheduler.step() was not necessary and is being deprecated where possible. Please use scheduler.step() to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning)
100%|#######################################| 1341/1341 [01:42<00:00, 13.15it/s]
20:16:19: base_transfer_learning_logger.log_message +29: INFO Collecting data for epoch 2
100%|#######################################| 1341/1341 [01:42<00:00, 13.07it/s]
20:19:36: base_transfer_learning_logger.log_message +29: INFO Collecting data for epoch 3
Exception ignored in: <function LocalTransferLearningLogger.__del__ at 0x7f515bad6cb0>
Traceback (most recent call last):
File "/home/jupyter/Reinvent/running_modes/transfer_learning/logging/local_transfer_learning_logger.py", line 20, in __del__
File "/opt/conda/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 1033, in close
File "/opt/conda/envs/reinvent.v3.0/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 133, in flush
File "/opt/conda/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 106, in flush
File "/opt/conda/envs/reinvent.v3.0/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 155, in flush
File "/opt/conda/envs/reinvent.v3.0/lib/python3.7/queue.py", line 89, in join
File "/opt/conda/envs/reinvent.v3.0/lib/python3.7/threading.py", line 289, in wait
TypeError: 'NoneType' object is not callable
Thanks for reporting. The issue should be resolved with the latest update. Please let us know if the problem persists.