
Text matching model: an illegal memory access was encountered.

Open shikeno opened this issue 2 years ago • 1 comments

Error message:

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py:1492: UserWarning: Skip loading for concat_fc.weight. concat_fc.w is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py:1492: UserWarning: Skip loading for concat_fc.bias. concat.b is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py:1492: UserWarning: Skip loading for output_layer.weight. linear_74.w_0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py:1492: UserWarning: Skip loading for output_layer.bias. linear_74.b_0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
Traceback (most recent call last):
  File "run_trainer.py", line 124, in <module>
    run_trainer(_params)
  File "run_trainer.py", line 101, in run_trainer
    trainer.do_train()
  File "/home/aistudio/work/ERNIE/applications/tasks/text_matching/trainer/custom_dynamic_trainer.py", line 55, in do_train
    forward_out = self.model_class(example, phase=InstanceName.TRAINING)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/home/aistudio/work/ERNIE/applications/tasks/text_matching/model/ernie_matching_siamese_pointwise.py", line 68, in forward
    task_ids=text_task_ids)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "../../../erniekit/modules/ernie.py", line 169, in forward
    sent_embedded = self.sent_emb(sent_ids)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/common.py", line 1469, in forward
    name=self._name)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/functional/input.py", line 206, in embedding
    'remote_prefetch', False, 'padding_idx', padding_idx)
OSError: (External) CUDA error(700), an illegal memory access was encountered.
  [Hint: 'cudaErrorIllegalAddress'. The device encountered a load or store instruction on an invalid memory address. This leaves the process in an inconsistentstate and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched. ] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:258)
  [operator < lookup_table_v2 > error]

I referred to https://github.com/PaddlePaddle/ERNIE/issues/466, but that issue is fairly old. My environment is the V100 environment on AI Studio, with paddlepaddle-gpu==2.3.2 and paddlenlp==2.3.2.
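The traceback fails inside the embedding lookup (`lookup_table_v2`) called from `self.sent_emb(sent_ids)`. One common cause of an illegal memory access in an embedding lookup is an id that falls outside the embedding table, though nothing in this report confirms that this is the cause here. A minimal sketch of a check that could be run on a batch before it reaches the model is below. It uses NumPy only; the function name `find_out_of_range_ids`, the example batch, and the table size of 2 are all hypothetical and not taken from the ERNIE code.

```python
import numpy as np


def find_out_of_range_ids(ids, num_embeddings):
    """Return (positions, values) of ids outside [0, num_embeddings).

    Hypothetical helper for illustration; not part of ERNIE or Paddle.
    """
    arr = np.asarray(ids, dtype=np.int64)
    # An id is invalid if it is negative or not smaller than the table size.
    bad_mask = (arr < 0) | (arr >= num_embeddings)
    positions = np.argwhere(bad_mask).tolist()
    values = arr[bad_mask].tolist()
    return positions, values


# Assumed example: a batch of sentence (segment) ids checked against
# an embedding table that is assumed to have 2 rows.
sent_ids = [[0, 0, 1, 1], [0, 1, 2, 0]]
positions, values = find_out_of_range_ids(sent_ids, num_embeddings=2)
print("out-of-range positions:", positions)
print("out-of-range values:", values)
```

If such a check reports any out-of-range ids, comparing the data reader's output with the sizes configured for the model's embedding tables would be a reasonable next step.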

shikeno avatar Jul 13 '23 03:07 shikeno

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reopen it. Thank you for your contributions.

stale[bot] avatar Sep 17 '23 00:09 stale[bot]