Text-Summarizer-Pytorch-Chinese

Error during training

Open GUZIYIN opened this issue 3 years ago • 4 comments

A question for the author: training fails with the following error: /opt/conda/conda-bld/pytorch_1587428091666/work/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: sampleMultinomialOnce: block: [5,0,0], thread: [800,0,0] Assertion val >= zero failed. After that, the log only keeps printing data_util.log - INFO - Bucket queue size: 1000, Input queue size: 10000. How can I fix this?

GUZIYIN, Jan 08 '21 01:01

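I found a similar issue: https://github.com/pytorch/pytorch/issues/10303. It may be that a parameter is set incorrectly somewhere, which causes the error once the code runs on the GPU.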
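For anyone debugging this: as far as I can tell, the `val >= zero` assertion in `sampleMultinomialOnce` fires when the probabilities handed to `torch.multinomial` contain an entry that is not greater than or equal to zero, which includes NaN as well as negative values. Where those probabilities come from in this repository is not something the thread establishes, so the sketch below only shows a generic way to locate bad rows. The helper name `find_invalid_rows` and the sample `batch` are made up for illustration. It works on plain Python lists so it can be run without a GPU.

```python
import math


def find_invalid_rows(prob_rows):
    """Return (row_index, reason) pairs for rows that are unsafe to sample from.

    Hypothetical helper, not part of the repository or of PyTorch.
    Each row is expected to be a list of probability weights.
    """
    problems = []
    for i, row in enumerate(prob_rows):
        if any(math.isnan(p) for p in row):
            problems.append((i, "nan"))
        elif any(math.isinf(p) for p in row):
            problems.append((i, "inf"))
        elif any(p < 0 for p in row):
            problems.append((i, "negative"))
        elif sum(row) <= 0:
            problems.append((i, "zero sum"))
    return problems


# Made-up example batch: one valid row and three kinds of invalid rows.
batch = [
    [0.2, 0.3, 0.5],
    [0.1, float("nan"), 0.9],
    [0.6, -0.1, 0.5],
    [0.0, 0.0, 0.0],
]
print(find_invalid_rows(batch))
# → [(1, 'nan'), (2, 'negative'), (3, 'zero sum')]
```

If the probabilities live in a tensor (called `probs` here as an assumption), something like `find_invalid_rows(probs.detach().cpu().tolist())` just before the sampling call should show whether the model is already producing NaN or negative values at that step. A NaN at this point often traces back to an exploding loss, a learning rate that is too high, or a division by zero earlier in the forward pass, but that is a guess rather than a confirmed diagnosis for this issue.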

GUZIYIN, Jan 08 '21 01:01

Do you have any more detailed information?

LowinLi, Jan 11 '21 04:01

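Your parameter settings are probably the problem. Increase the number of training iterations to more than 50, because the loss value is only reported once every 50 iterations.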

yamonc, Feb 02 '21 09:02

A question for the author: training fails with the following error: /opt/conda/conda-bld/pytorch_1587428091666/work/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: sampleMultinomialOnce: block: [5,0,0], thread: [800,0,0] Assertion val >= zero failed. After that, the log only keeps printing data_util.log - INFO - Bucket queue size: 1000, Input queue size: 10000. How can I fix this?

I have run into the same problem. How did you solve it?

Leitao1986, Feb 03 '21 09:02