
DefaultCPUAllocator: not enough memory

Open · ChenYaChu opened this issue on Jul 30, 2019 · 11 comments

I tried to run python -u run_models.py --h_dim 300 --mb_size 32 --n_epoch 20 --gpu --lr 0.0001 until I got:

RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:62] data. DefaultCPUAllocator: not enough memory: you tried to allocate %dGB. Buy new RAM!208096001

I have 48 GB of RAM and a GTX 1060 GPU (6 GB). Is that not enough?

What should I do? Thanks.

ChenYaChu avatar Jul 30 '19 05:07 ChenYaChu
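As a side note, the trailing number in the allocator message is presumably the requested allocation in bytes, with the %dGB placeholder simply left unformatted by this PyTorch build; under that assumption, a one-line check shows the failed request is far smaller than 48 GB:

# Assumption: 208096001 is the requested allocation in bytes.
print(208096001 / 1024 ** 3)   # roughly 0.19 GiB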

The model keeps the pre-processed data in memory, so I don't think decreasing the batch size will help. Can you post the full error stack trace?

deepcode-debug avatar Jul 30 '19 16:07 deepcode-debug

I restarted pre-processed.py, then ran run_models.py and got:

Finished loading dataset!
D:\chen_ya\AK-DE-biGRU-master\models.py:484: UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
  nn.init.xavier_normal(self.M)


Epoch-0

Training: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "D:/chen_ya/AK-DE-biGRU-master/run_models.py", line 140, in <module>
    run_model()
  File "D:/chen_ya/AK-DE-biGRU-master/run_models.py", line 84, in run_model
    output = model(context, response, cm, rm, key_r, key_mask_r)  # Appropriate this line while running different models
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\chen_ya\AK-DE-biGRU-master\models.py", line 513, in forward
    sc, sr, c, r = self.forward_enc(x1, x2, key_emb_r)
  File "D:\chen_ya\AK-DE-biGRU-master\models.py", line 555, in forward_enc
    x1_emb = self.emb_drop(self.word_embed(x1))  # B X S X E
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\sparse.py", line 117, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py", line 1506, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got CPUType instead (while checking arguments for embedding)

Is that a problem with my PyTorch version? My PyTorch version is 1.1.0.

ChenYaChu avatar Jul 31 '19 02:07 ChenYaChu

No, it looks like you are running it on the CPU. Did you use the argument for running it on the GPU?

deepcode-debug avatar Jul 31 '19 14:07 deepcode-debug
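For reference, a minimal self-contained sketch of the embedding error above and the usual fix; it uses a toy nn.Embedding rather than the repository's own model, so the tensor names are illustrative only:

import torch
import torch.nn as nn

embed = nn.Embedding(num_embeddings=100, embedding_dim=8)

# Indices that are not int64 trigger "Expected tensor for argument #1 'indices'
# to have scalar type Long" inside torch.embedding.
float_idx = torch.randint(0, 100, (4, 5)).float()   # wrong dtype
long_idx = float_idx.long()                          # cast to LongTensor first

out = embed(long_idx)                                # works: shape (4, 5, 8)

# With --gpu, the module and the index tensor must also live on the same device.
if torch.cuda.is_available():
    embed = embed.cuda()
    out = embed(long_idx.cuda())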

Yes. I tried it again and it shows RuntimeError: CuDNN error: CUDNN_STATUS_SUCCESS

I checked my CUDA and cuDNN versions; they are 9.1 and 7.1.2. Does that mean I have the wrong versions?

ChenYaChu avatar Aug 01 '19 07:08 ChenYaChu

Maybe check your PyTorch GPU configuration/version properly with this:

In [1]: import torch

In [2]: torch.cuda.current_device()
Out[2]: 0

In [3]: torch.cuda.device(0)
Out[3]: <torch.cuda.device at 0x7efce0b03be0>

In [4]: torch.cuda.device_count()
Out[4]: 1

In [5]: torch.cuda.get_device_name(0)
Out[5]: 'GeForce GTX 950M'

In [6]: torch.cuda.is_available()
Out[6]: True

deepcode-debug avatar Aug 03 '19 14:08 deepcode-debug
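A small addition to the checklist above (not part of the original comment): these calls report the CUDA and cuDNN versions the installed PyTorch wheel was built against, which is what actually has to line up, rather than whatever toolkit happens to be installed system-wide:

import torch

print(torch.__version__)                # e.g. 1.1.0
print(torch.version.cuda)               # CUDA runtime the wheel was built with
print(torch.backends.cudnn.version())   # bundled cuDNN version as an integer
print(torch.backends.cudnn.enabled)     # whether cuDNN is enabled at all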

I finally ran it successfully, but when the run finished it showed this:

(Screenshot: 2019-08-06 11-58-12)

How should I solve it? Thanks.

ChenYaChu avatar Aug 06 '19 04:08 ChenYaChu

Cool that you could run it. Can you please change loss.data[0] to loss.item() and provide the results?

deepcode-debug avatar Aug 06 '19 10:08 deepcode-debug
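For context, a short self-contained sketch of that change: loss.data[0] was the pre-0.4 way to read a scalar loss, and in recent PyTorch versions (including 1.1.0) indexing a 0-dimensional tensor raises an error, which is presumably what the screenshot shows; loss.item() is the replacement:

import torch
import torch.nn.functional as F

pred = torch.randn(3, 5, requires_grad=True)
target = torch.tensor([1, 0, 4])
loss = F.cross_entropy(pred, target)   # a 0-dimensional tensor

# loss.data[0]   # old style; fails on 0-dim tensors in recent PyTorch
print(loss.item())                     # new style: returns a plain Python float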

Thanks very much. I have finished training, but I want to know how to chat with and test the model on my computer.

ChenYaChu avatar Aug 11 '19 04:08 ChenYaChu

Provide the model with a question utterance and a set of possible responses; the output will be the predicted response.

deepcode-debug avatar Aug 12 '19 08:08 deepcode-debug

Is there something I should run when I provide the model with a question utterance?

ChenYaChu avatar Aug 13 '19 02:08 ChenYaChu

I don't have anything ready-made. You would need to reuse parts of the batcher (data) and run_models to get the predictions.

deepcode-debug avatar Aug 13 '19 12:08 deepcode-debug
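A rough sketch of what that reuse might look like; the function and variable names here (rank_responses, context_batch, candidate_batches) are placeholders, and the real tensors and call signature come from the repository's batcher and models.py:

import torch

def rank_responses(model, context_batch, candidate_batches):
    """Score each candidate response against the context and return the best index.

    Assumes `model` is the trained dual-encoder and that calling it with the
    same arguments used during training returns a single matching score per
    (context, response) pair.
    """
    model.eval()
    scores = []
    with torch.no_grad():
        for cand in candidate_batches:
            score = model(*context_batch, *cand)   # mirrors the training-time call
            scores.append(score.item())
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores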