ASRT_SpeechRecognition

Error during model training

Open stfeiseu opened this issue 4 years ago • 2 comments

During training, the following error appears after the model has been saved once. My GPU is a GeForce 1060 and batch_size is set to 8. Is this caused by insufficient GPU memory?

2020-06-08 21:06:33.504024: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7845
pciBusID: 0000:01:00.0
2020-06-08 21:06:33.504241: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-06-08 21:06:33.504373: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-06-08 21:06:33.504513: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-06-08 21:06:33.504648: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-06-08 21:06:33.504782: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-06-08 21:06:33.504917: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-06-08 21:06:33.505052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-06-08 21:06:33.505494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-06-08 21:06:33.505660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-08 21:06:33.505794: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-06-08 21:06:33.505878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-06-08 21:06:33.506282: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4708 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
Test Count: 0 / 4
2020-06-08 21:06:33.629067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7845
pciBusID: 0000:01:00.0
2020-06-08 21:06:33.629275: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-06-08 21:06:33.629405: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-06-08 21:06:33.629613: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-06-08 21:06:33.629743: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-06-08 21:06:33.629874: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-06-08 21:06:33.630009: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-06-08 21:06:33.630141: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-06-08 21:06:33.630637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-06-08 21:06:33.630937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-08 21:06:33.631069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-06-08 21:06:33.631151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-06-08 21:06:33.631568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4708 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "D:/语音数据集/ASRT_SpeechRecognition-master/train_mspeech.py", line 48, in <module>
    ms.TrainModel(datapath, epoch = 50, batch_size = 4, save_step = 1000)
  File "D:\语音数据集\ASRT_SpeechRecognition-master\SpeechModel251.py", line 187, in TrainModel
    self.TestModel(self.datapath, str_dataset='train', data_count = 4)
  File "D:\语音数据集\ASRT_SpeechRecognition-master\SpeechModel251.py", line 254, in TestModel
    pre = self.Predict(data_input, data_input.shape[0] // 8)
  File "D:\语音数据集\ASRT_SpeechRecognition-master\SpeechModel251.py", line 309, in Predict
    base_pred = self.base_model.predict(x = x_in)
  File "C:\Users\Tengfei\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1462, in predict
    callbacks=callbacks)
  File "C:\Users\Tengfei\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training_arrays.py", line 324, in predict_loop
    batch_outs = f(ins_batch)
  File "C:\Users\Tengfei\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3473, in __call__
    self._make_callable(feed_arrays, feed_symbols, symbol_vals, session)
  File "C:\Users\Tengfei\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3410, in _make_callable
    callable_fn = session._make_callable_from_options(callable_opts)
  File "C:\Users\Tengfei\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1505, in _make_callable_from_options
    return BaseSession._Callable(self, callable_options)
  File "C:\Users\Tengfei\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1460, in __init__
    session._session, options_ptr)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tensor the_input:0, specified in either feed_devices or fetch_devices was not found in the Graph

Process finished with exit code 1

stfeiseu, Jun 08 '20 13:06

For this problem, please see my project documentation. The 1060 has only 6 GB of video memory, which is probably not enough. https://asrt.ailemon.me/docs/issues
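To put a number on the memory question, one option is to read how much memory TensorFlow actually reserved from the startup log in the original post. The sketch below is only an illustration: the function name `reserved_gpu_memory` and the regular expression are assumptions written against the "Created TensorFlow device" line shown in that log, not part of ASRT or TensorFlow. It reports the reservation only and does not diagnose the `InvalidArgumentError` itself.

```python
import re

# Matches the "Created TensorFlow device" line as it appears in the log above.
# The pattern is an assumption based on that one log format.
DEVICE_RE = re.compile(
    r"Created TensorFlow device \(.*?device:GPU:(?P<index>\d+) "
    r"with (?P<mb>\d+) MB memory\) -> physical GPU "
    r"\(device: \d+, name: (?P<name>[^,]+),"
)


def reserved_gpu_memory(log_text):
    """Return a list of (gpu_index, gpu_name, megabytes) found in log_text."""
    # Logs pasted into an issue are often re-wrapped, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return [
        (int(m.group("index")), m.group("name"), int(m.group("mb")))
        for m in DEVICE_RE.finditer(flat)
    ]


# One line copied from the log in the original post.
sample = (
    "2020-06-08 21:06:33.506282: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1304] Created TensorFlow device "
    "(/job:localhost/replica:0/task:0/device:GPU:0 with 4708 MB memory) "
    "-> physical GPU (device: 0, name: GeForce GTX 1060 6GB, "
    "pci bus id: 0000:01:00.0, compute capability: 6.1)"
)

print(reserved_gpu_memory(sample))
```

For the log above this shows that TensorFlow reserved 4708 MB of the card's 6 GB, which is the figure to compare against the model's memory needs.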

nl8590687, Jun 08 '20 13:06

Thanks, I'll take a look.

stfeiseu, Jun 08 '20 14:06