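Problem when training on my own dataset
I built my own dataset following your code and pipeline: just over 90,000 images in total, 400 boats, and more than 200 models. Training fails right at the start with the error below. What could be causing it?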
Training...
=> Epoch Train loss Train acc Test acc
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [12,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "RepNet.py", line 1606, in <module>
    train(resume=None)  # train from scratch
  File "RepNet.py", line 1541, in train
    epoch_loss.append(loss.cpu().item())
RuntimeError: CUDA error: device-side assert triggered
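@BattleZhan Judging from the error, the number of output classes is very likely wrong: the failed assertion means at least one target label falls outside the range [0, n_classes). Please check your code carefully.
@BattleZhan Modify the network to match your own dataset so that it outputs the correct number of classes.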
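One way to check the suggestion above before launching training is to scan the labels the data-processing script produced. The following is only a minimal sketch, not code from RepNet.py: the helper names `find_out_of_range_labels` and `remap_to_contiguous` and the sample IDs are hypothetical, and it assumes the labels can be loaded as a plain list of integers.

```python
def find_out_of_range_labels(labels, n_classes):
    """Return the sorted distinct label values outside [0, n_classes)."""
    return sorted({int(t) for t in labels if not 0 <= int(t) < n_classes})


def remap_to_contiguous(labels):
    """Map arbitrary integer IDs to 0..K-1 and return (new_labels, mapping)."""
    mapping = {old: new for new, old in enumerate(sorted(set(labels)))}
    return [mapping[t] for t in labels], mapping


if __name__ == "__main__":
    # Hypothetical raw IDs: 1-based or sparse IDs are a common cause of this assert.
    raw_ids = [1, 5, 5, 400, 12]
    print(find_out_of_range_labels(raw_ids, n_classes=400))  # label 400 is invalid for 400 classes

    new_ids, mapping = remap_to_contiguous(raw_ids)
    # len(mapping) is the number of distinct classes the output layer would need.
    print(new_ids, len(mapping))
```

If the first call returns a non-empty list, either the labels need remapping or the output layer needs more classes. Running a single batch on the CPU instead of the GPU usually gives a more readable out-of-range error than the device-side assert.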
OK, thank you. My preprocessing script probably has a mistake somewhere; I will go through it carefully again.