DrQA

AssertionError: Torch not compiled with CUDA enabled

Open balagopal24 opened this issue 4 years ago • 2 comments

$ python3 train.py -e 40 -bs 32

02/15/2020 05:17:11 [Program starts. Loading data...]
02/15/2020 05:22:48 {'log_per_updates': 3, 'data_file': 'SQuAD/data.msgpack', 'model_dir': '/Users/balagopalbhallamudi/Desktop/DrQA/models', 'save_last_only': False, 'save_dawn_logs': False, 'seed': 1013, 'cuda': False, 'epochs': 40, 'batch_size': 32, 'resume': '', 'resume_options': False, 'reduce_lr': 0.0, 'optimizer': 'adamax', 'grad_clipping': 10, 'weight_decay': 0, 'learning_rate': 0.1, 'momentum': 0, 'tune_partial': 1000, 'fix_embeddings': False, 'rnn_padding': False, 'question_merge': 'self_attn', 'doc_layers': 3, 'question_layers': 3, 'hidden_size': 128, 'num_features': 4, 'pos': True, 'ner': True, 'use_qemb': True, 'concat_rnn_layers': True, 'dropout_emb': 0.4, 'dropout_rnn': 0.4, 'dropout_rnn_output': True, 'max_len': 15, 'rnn_type': 'lstm', 'pretrained_words': True, 'vocab_size': 91590, 'embedding_dim': 300, 'pos_size': 50, 'ner_size': 19}
02/15/2020 05:22:48 [Data loaded.]
02/15/2020 05:22:48 Epoch 1
02/15/2020 07:07:48 > epoch [ 1] updates[ 2707] train loss[4.38260] remaining[0:00:00]

02/15/2020 07:09:46 dev EM: 53.140964995269634 F1: 64.78947947738538
Traceback (most recent call last):
  File "train.py", line 377, in <module>
    main()
  File "train.py", line 87, in main
    model.save(model_file, epoch, [em, f1, best_val_score])
  File "/Users/balagopalbhallamudi/Desktop/DrQA/drqa/model.py", line 147, in save
    'torch_cuda_state': torch.cuda.get_rng_state()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/cuda/random.py", line 20, in get_rng_state
    _lazy_init()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
    _check_driver()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/cuda/__init__.py", line 94, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

(base) Balagopals-MacBook-Pro:DrQA balagopalbhallamudi$ python3 interact.py
Traceback (most recent call last):
  File "interact.py", line 31, in <module>
    checkpoint = torch.load(args.model_file, map_location=lambda storage, loc: storage)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 525, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 212, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 193, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/best_model.pt'

balagopal24, Feb 15 '20 05:02

I haven't fully tested the code in pure-CPU training. It seems like you'll have to remove the line 'torch_cuda_state': torch.cuda.get_rng_state() for CPU-only training.

hitvoice, May 12 '20 03:05

I can confirm that @hitvoice's solution worked. In model.py, in the save function, remove 'torch_cuda_state': torch.cuda.get_rng_state() and it will work. The model now gets saved without CUDA.

namratasaun, Jun 08 '21 06:06
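
An alternative to deleting the line outright is to guard it, so the CUDA RNG state is only collected when CUDA is actually usable. The following is a minimal sketch, not the repository's code: the helper name collect_rng_state and the 'random_state' / 'torch_state' keys are assumptions for illustration; only the 'torch_cuda_state' key and the torch.cuda.get_rng_state() call come from the traceback above. The torch module is passed in as a parameter purely so the guard can be exercised without a PyTorch install.

```python
import random


def collect_rng_state(torch):
    """Gather RNG states for a checkpoint (hypothetical helper).

    `torch` is the imported PyTorch module. The CUDA entry is added only
    when torch.cuda.is_available() is true, so CPU-only builds never call
    torch.cuda.get_rng_state(), which is the call that raised
    "Torch not compiled with CUDA enabled".
    """
    state = {
        # Key names other than 'torch_cuda_state' are assumptions.
        'random_state': random.getstate(),
        'torch_state': torch.get_rng_state(),
    }
    if torch.cuda.is_available():
        state['torch_cuda_state'] = torch.cuda.get_rng_state()
    return state
```

With PyTorch installed, this would be called as collect_rng_state(torch) after import torch, and the returned entries merged into the dict that save() writes. Any code that restores the checkpoint would then need to treat 'torch_cuda_state' as optional. The second traceback (FileNotFoundError for models/best_model.pt) is consistent with the first: the save call raised before a model file was written, so interact.py had nothing to load.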