
Training on large knowledge graphs is not supported

Open wjy3326 opened this issue 2 years ago • 4 comments

My knowledge graph currently has about 15 million entities. Training it with TransE fails with an out-of-memory (OOM) error, and the error still occurs even with the batch size set to 1. How can I solve this?

    Traceback (most recent call last):
      File "train_transe_WN18_adv_sigmoidloss.py", line 52, in <module>
        trainer.run()
      File "/ai-images/wjy/event_extraction/recommendation/OpenKE-PyTorch/openke/config/Trainer.py", line 93, in run
        loss = self.train_one_step(data)
      File "/ai-images/wjy/event_extraction/recommendation/OpenKE-PyTorch/openke/config/Trainer.py", line 52, in train_one_step
        loss.backward()
      File "/home/user/anaconda3/envs/image_text_match_tf_1.15/lib/python3.6/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/user/anaconda3/envs/image_text_match_tf_1.15/lib/python3.6/site-packages/torch/autograd/__init__.py", line 147, in backward
        allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    RuntimeError: CUDA out of memory. Tried to allocate 5.60 GiB (GPU 0; 23.70 GiB total capacity; 16.99 GiB already allocated; 1.43 GiB free; 17.00 GiB reserved in total by PyTorch)

wjy3326 · Jul 22 '22 08:07
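
A rough back-of-envelope estimate may explain why lowering the batch size does not help here: the entity embedding table, and the dense gradient that PyTorch builds for it during `loss.backward()`, scale with the number of entities rather than with the batch. The sketch below is only an illustration under stated assumptions. The embedding dimension is not given in this thread, so `ASSUMED_DIM = 100` is hypothetical, as is the helper name `embedding_table_gib`. It also assumes float32 values and a dense (non-sparse) embedding gradient.

```python
def embedding_table_gib(num_rows: int, dim: int, bytes_per_value: int = 4) -> float:
    """Size in GiB of one dense num_rows x dim tensor (float32 by default)."""
    return num_rows * dim * bytes_per_value / 2**30


NUM_ENTITIES = 15_000_000  # "about 15 million entities", from the original post
ASSUMED_DIM = 100          # assumption: the thread does not state the dimension

# The weights themselves occupy one table-sized tensor.
weights_gib = embedding_table_gib(NUM_ENTITIES, ASSUMED_DIM)

# A dense gradient has the same shape as the weights, regardless of batch size.
dense_grad_gib = weights_gib

print(f"entity embedding weights: {weights_gib:.2f} GiB")
print(f"dense gradient:           {dense_grad_gib:.2f} GiB")
```

With these assumed values, a single table-sized tensor comes to about 5.59 GiB, which is the same order as the 5.60 GiB allocation that fails in the traceback. Optimizer state (for example, Adam's moment buffers) would add further tensors of the same size, so the total can exceed a 24 GiB GPU even when each batch holds a single triple.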

Have you found any other open-source code that supports large knowledge graphs?

YBAgg · Jul 29 '22 08:07

What about running it on the CPU?

hopegithub · Aug 19 '22 07:08

> Have you found any other open-source code that supports large knowledge graphs?

I think DGL-KE might be worth a try. Have you looked into it?

YijianLiu · Oct 06 '22 13:10

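> > Have you found any other open-source code that supports large knowledge graphs?
>
> I think DGL-KE might be worth a try. Have you looked into it?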

@YijianLiu DGL-KE seems harder to debug than OpenKE. Also, you don't work with the source code there; everything is run with a single terminal command.

stupidoge · Aug 02 '23 05:08