
[Question]: UIE compression errors out partway through

Open yuwochangzai opened this issue 2 years ago • 8 comments

Compression was run with the following command:

```
python finetune.py \
    --device cpu \
    --logging_steps 10 \
    --save_steps 100 \
    --eval_steps 100 \
    --seed 42 \
    --model_name_or_path ./checkpoint/model_best \
    --output_dir export \
    --train_path data/train.txt \
    --dev_path data/dev.txt \
    --max_seq_length 512 \
    --per_device_eval_batch_size 16 \
    --per_device_train_batch_size 1 \
    --num_train_epochs 1 \
    --learning_rate 1e-5 \
    --do_compress True \
    --overwrite_output_dir \
    --disable_tqdm True \
    --metric_for_best_model eval_f1 \
    --save_total_limit 1 \
    --strategy 'qat'
```

The error message is as follows:

```
[2022-11-04 07:04:34,524] [ INFO] - f1: 0.6206896551724138, precision: 0.782608695652174, recall: 0.6206896551724138
[2022-11-04 07:04:34,527] [ INFO] - eval done total: 41.88436722755432 s
[2022-11-04 07:05:16,472] [ INFO] - global step 510, epoch: 0, batch: 509, loss: 0.000004, speed: 0.12 step/s
[2022-11-04 07:05:58,469] [ INFO] - global step 520, epoch: 0, batch: 519, loss: 0.000007, speed: 0.24 step/s
[2022-11-04 07:06:39,883] [ INFO] - global step 530, epoch: 0, batch: 529, loss: 0.000008, speed: 0.24 step/s
[2022-11-04 07:07:22,352] [ INFO] - global step 540, epoch: 0, batch: 539, loss: 0.000049, speed: 0.24 step/s
[2022-11-04 07:08:07,073] [ INFO] - global step 550, epoch: 0, batch: 549, loss: 0.000022, speed: 0.22 step/s
[2022-11-04 07:08:17,428] [ INFO] - Best result: 0.6667
Traceback (most recent call last):
File "/usr/projects/uie-3496/finetune.py", line 342, in main()
File "/usr/projects/uie-3496/finetune.py", line 338, in main trainer.compress(custom_evaluate=custom_evaluate)
File "/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 91, in compress self.quant(args.output_dir, args.strategy)
File "/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 101, in quant _quant_aware_training_dynamic(self, model_dir)
File "/home/icvip/.local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 752, in _quant_aware_training_dynamic quanter.save_quantized_model(self.model,
File "/home/icvip/.local/lib/python3.9/site-packages/paddleslim/dygraph/quant/qat.py", line 289, in save_quantized_model self.imperative_qat.save_quantized_model(
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/contrib/slim/quantization/imperative/qat.py", line 273, in save_quantized_model self._quantize_outputs.save_quantized_model(layer, path, input_spec,
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/contrib/slim/quantization/imperative/qat.py", line 483, in save_quantized_model paddle.jit.save(layer=model, path=path, input_spec=input_spec, **config)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/jit.py", line 631, in wrapper func(layer, path, input_spec, **configs)
File "/home/icvip/.local/lib/python3.9/site-packages/decorator.py", line 232, in fun return caller(func, *(extras + args), **kw)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in impl return wrapped_func(*args, **kwargs)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/base.py", line 51, in impl return func(*args, **kwargs)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/jit.py", line 871, in save concrete_program = static_forward.concrete_program_specify_input_spec(
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 527, in concrete_program_specify_input_spec concrete_program, _ = self.get_concrete_program(
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 436, in get_concrete_program concrete_program, partial_program_layer = self._program_cache[cache_key]
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 801, in getitem self._caches[item_id] = self._build_once(item)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 785, in build_once concrete_program = ConcreteProgram.from_func_spec(
File "/home/icvip/.local/lib/python3.9/site-packages/decorator.py", line 232, in fun return caller(func, *(extras + args), **kw)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in impl return wrapped_func(*args, **kwargs)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/base.py", line 51, in impl return func(*args, **kwargs)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 740, in from_func_spec error_data.raise_new_exception()
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/error.py", line 336, in raise_new_exception six.exec("raise new_exception from None")
File "", line 1, in
TypeError: In transformed code:

File "/usr/projects/uie-3496/model.py", line 31, in forward
    def forward(self, input_ids, token_type_ids, pos_ids=None, att_mask=None):
        sequence_output, _ = self.encoder(input_ids=input_ids,
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                                          token_type_ids=token_type_ids,
                                          position_ids=pos_ids,

File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
File "/tmp/tmpa2cqs75v.py", line 93, in auto_model_forward
    ] = paddle.jit.dy2static.convert_while_loop(for_loop_condition_0,
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 45, in convert_while_loop
    loop_vars = _run_py_while(cond, body, loop_vars)
File "/home/icvip/.local/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 59, in _run_py_while
    loop_vars = body(*loop_vars)
File "/tmp/tmpa2cqs75v.py", line 88, in for_loop_body_0
    __for_loop_iter_var_0 = kwargs_keys[__for_loop_var_index_0]

TypeError: 'odict_keys' object is not subscriptable
```

yuwochangzai · Nov 04 '22 08:11
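For readers landing on this issue: the TypeError itself is independent of UIE. The loop body that dygraph-to-static generates for the encoder call (for_loop_body_0 above) indexes directly into kwargs.keys(), and in Python 3 a dict keys view is not subscriptable. A minimal, Paddle-free sketch of the same failure (the variable names here are illustrative, not copied from the generated file):

```python
# Minimal sketch of the failure mode seen in the traceback above:
# indexing into a dict keys view raises exactly this TypeError.
from collections import OrderedDict

kwargs = OrderedDict(input_ids=1, token_type_ids=2)
kwargs_keys = kwargs.keys()        # an odict_keys view, not a list

try:
    kwargs_keys[0]                 # what the generated loop body effectively does
except TypeError as err:
    print(err)                     # 'odict_keys' object is not subscriptable

first_key = list(kwargs_keys)[0]   # materializing the view first avoids the error
print(first_key)                   # input_ids
```

Newer paddle releases appear to handle this case in the dy2static conversion, which is consistent with the upgrade suggested later in the thread.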

Is GPU memory overflowing?

wawltor · Nov 04 '22 09:11

Is GPU memory overflowing?

The compression was run on CPU.

yuwochangzai · Nov 04 '22 09:11

Is GPU memory overflowing?

Hello, I ran into the same problem. I was compressing on GPU and got the same error, and GPU memory did not overflow. I set the number of epochs to 100, and the error occurred at the last step. The last log lines before the error were:

```
[2022-11-04 17:55:29,124] [ INFO] - global step 9690, epoch: 99, batch: 86, loss: 0.000001, speed: 4.03 step/s
[2022-11-04 17:55:31,612] [ INFO] - global step 9700, epoch: 99, batch: 96, loss: 0.000000, speed: 4.04 step/s
[2022-11-04 17:55:36,793] [ INFO] - f1: 0.6990881458966565, precision: 0.732484076433121, recall: 0.6990881458966565
[2022-11-04 17:55:36,795] [ INFO] - eval done total: 5.182710409164429 s
[2022-11-04 17:55:36,796] [ INFO] - Best result: 0.7130
Traceback (most recent call last):
File ".\finetune.py", line 289, in
```

starryzwh · Nov 04 '22 10:11

Hi, I couldn't reproduce this in my local environment. Could you both provide more environment information, e.g. the paddlepaddle, paddlenlp, and python versions, and which UIE pretrained model you used for fine-tuning? @starryzwh please also paste the error message if convenient.

LiuChiachi · Nov 04 '22 11:11
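A quick way to gather the requested version information from the failing environment, assuming all three packages are importable there:

```python
# Print the framework versions relevant to this issue; run this in the same
# Python environment that executes finetune.py.
import paddle
import paddlenlp
import paddleslim

print("paddlepaddle:", paddle.__version__)
print("paddlenlp:", paddlenlp.__version__)
print("paddleslim:", paddleslim.__version__)
```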

It looks like upgrading to version 2.4rc0 solves the problem. (screenshot attached)

wawltor · Nov 04 '22 12:11

Hi, I couldn't reproduce this in my local environment. Could you both provide more environment information, e.g. the paddlepaddle, paddlenlp, and python versions, and which UIE pretrained model you used for fine-tuning? @starryzwh please also paste the error message if convenient.

Hello, the environment is win11, paddlepaddle-gpu==2.3.2, paddlenlp==2.4.1, paddleslim==2.3.4, python==3.8. The detailed error message is as follows:

```
[2022-11-04 17:55:26,605] [ INFO] - global step 9680, epoch: 99, batch: 76, loss: 0.000004, speed: 4.01 step/s
[2022-11-04 17:55:29,124] [ INFO] - global step 9690, epoch: 99, batch: 86, loss: 0.000001, speed: 4.03 step/s
[2022-11-04 17:55:31,612] [ INFO] - global step 9700, epoch: 99, batch: 96, loss: 0.000000, speed: 4.04 step/s
[2022-11-04 17:55:36,793] [ INFO] - f1: 0.6990881458966565, precision: 0.732484076433121, recall: 0.6990881458966565
[2022-11-04 17:55:36,795] [ INFO] - eval done total: 5.182710409164429 s
[2022-11-04 17:55:36,796] [ INFO] - Best result: 0.7130
Traceback (most recent call last):
File ".\finetune.py", line 289, in main()
File ".\finetune.py", line 285, in main trainer.compress(custom_evaluate=custom_evaluate)
File "C:\Users\46383\software\miniconda3\envs\paddle_test\lib\site-packages\paddlenlp\trainer\trainer_compress.py", line 91, in compress self.quant(args.output_dir, args.strategy)
File "C:\Users\46383\software\miniconda3\envs\paddle_test\lib\site-packages\paddlenlp\trainer\trainer_compress.py", line 101, in quant _quant_aware_training_dynamic(self, model_dir)
File "C:\Users\46383\software\miniconda3\envs\paddle_test\lib\site-packages\paddlenlp\trainer\trainer_compress.py", line 752, in _quant_aware_training_dynamic quanter.save_quantized_model(self.model,
File "C:\Users\46383\software\miniconda3\envs\paddle_test\lib\site-packages\paddleslim\dygraph\quant\qat.py", line 289, in save_quantized_model self.imperative_qat.save_quantized_model(
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\contrib\slim\quantization\imperative\qat.py", line 273, in save_quantized_model self._quantize_outputs.save_quantized_model(layer, path, input_spec,
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\contrib\slim\quantization\imperative\qat.py", line 483, in save_quantized_model paddle.jit.save(layer=model, path=path, input_spec=input_spec, **config)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\jit.py", line 631, in wrapper func(layer, path, input_spec, **configs)
File "C:\Users\46383\software\miniconda3\envs\paddle_test\lib\site-packages\decorator.py", line 232, in fun return caller(func, *(extras + args), **kw)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl return wrapped_func(*args, **kwargs)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\base.py", line 51, in impl return func(*args, **kwargs)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\jit.py", line 871, in save concrete_program = static_forward.concrete_program_specify_input_spec(
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 527, in concrete_program_specify_input_spec concrete_program, _ = self.get_concrete_program(
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 436, in get_concrete_program concrete_program, partial_program_layer = self._program_cache[cache_key]
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 801, in getitem self._caches[item_id] = self._build_once(item)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 785, in build_once concrete_program = ConcreteProgram.from_func_spec(
File "C:\Users\46383\software\miniconda3\envs\paddle_test\lib\site-packages\decorator.py", line 232, in fun return caller(func, *(extras + args), **kw)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl return wrapped_func(*args, **kwargs)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\base.py", line 51, in impl return func(*args, **kwargs)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\program_translator.py", line 740, in from_func_spec error_data.raise_new_exception()
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\error.py", line 336, in raise_new_exception six.exec("raise new_exception from None")
File "", line 1, in
TypeError: In transformed code:

File "D:\PaddleNLP_UIE\model.py", line 31, in forward
    def forward(self, input_ids, token_type_ids, pos_ids=None, att_mask=None):
        sequence_output, _ = self.encoder(input_ids=input_ids,
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                                          token_type_ids=token_type_ids,
                                          position_ids=pos_ids,

File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
File "C:\Users\46383\AppData\Local\Temp\tmp377329ie.py", line 93, in auto_model_forward
    ] = paddle.jit.dy2static.convert_while_loop(for_loop_condition_0,
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\convert_operators.py", line 45, in convert_while_loop
    loop_vars = _run_py_while(cond, body, loop_vars)
File "C:\Users\46383\AppData\Roaming\Python\Python38\site-packages\paddle\fluid\dygraph\dygraph_to_static\convert_operators.py", line 59, in _run_py_while
    loop_vars = body(*loop_vars)
File "C:\Users\46383\AppData\Local\Temp\tmp377329ie.py", line 88, in for_loop_body_0
    __for_loop_iter_var_0 = kwargs_keys[__for_loop_var_index_0]

TypeError: 'odict_keys' object is not subscriptable
```

starryzwh · Nov 07 '22 01:11
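If you want to inspect the code that dygraph-to-static actually generated (the tmp*.py files referenced in both tracebacks), paddle ships a debugging switch for that; the helper below is assumed to be available in the paddle 2.x versions discussed here:

```python
# Debugging aid (assumed available in paddle 2.x): print the code produced by
# the dygraph-to-static transform, i.e. the contents of the tmp*.py files seen
# in the tracebacks, so the failing generated loop can be read directly.
import paddle

paddle.jit.set_code_level(100)  # 100 prints the code after all AST transforms
```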

(Quoting starryzwh's environment details and error report above.)

Upgrade paddle to the latest 2.4rc version.

wawltor · Nov 07 '22 01:11

Upgrade paddle to the latest 2.4rc version.

Hello, as you suggested, I installed version 2.4.0rc0 with conda install paddle paddlepaddle-gpu. When running finetune.py with model compression on GPU, the process exits immediately once the startup info reaches the device lines; on CPU it continues past that point, but extremely slowly. The relevant output is below.

```
python .\finetune.py --train_path .\data\train.txt --dev_path .\data\dev.txt --output_dir .\checkpoint\model_best --learning_rate 1e-5 --per_device_eval_batch_size 8 --per_device_train_batch_size 8 --max_seq_len 512 --num_train_epochs 50 --model_name_or_path .\checkpoint\model_best --seed 1000 --logging_steps 10 --eval_steps 100 --save_steps 100 --device gpu --do_compress --overwrite_output_dir --disable_tqdm True --metric_for_best_model eval_f1 --save_total_limit 1 --strategy qat
```

```
[2022-11-07 16:29:59,313] [ INFO] - ============================================================
[2022-11-07 16:29:59,316] [ INFO] - Model Configuration Arguments
[2022-11-07 16:29:59,317] [ INFO] - paddle commit id :083853cd4e4a9bdad22c70fa48eb9a036d2def27
[2022-11-07 16:29:59,318] [ INFO] - export_model_dir :None
[2022-11-07 16:29:59,318] [ INFO] - model_name_or_path :.\checkpoint\model_best
[2022-11-07 16:29:59,318] [ INFO] - multilingual :False
[2022-11-07 16:29:59,318] [ INFO] -
[2022-11-07 16:29:59,319] [ INFO] - ============================================================
[2022-11-07 16:29:59,319] [ INFO] - Data Configuration Arguments
[2022-11-07 16:29:59,319] [ INFO] - paddle commit id :083853cd4e4a9bdad22c70fa48eb9a036d2def27
[2022-11-07 16:29:59,320] [ INFO] - dev_path :.\data\dev.txt
[2022-11-07 16:29:59,320] [ INFO] - max_seq_length :512
[2022-11-07 16:29:59,320] [ INFO] - train_path :.\data\train.txt
[2022-11-07 16:29:59,321] [ INFO] -
[2022-11-07 16:29:59,321] [ WARNING] - Process rank: -1, device: gpu, world_size: 1, distributed training: False, 16-bits training: False
[2022-11-07 16:29:59,322] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load '.\checkpoint\model_best'.
W1107 16:29:59.363667 20828 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 10.2
W1107 16:29:59.423164 20828 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
(paddle_test) PS D:\信息抽取-PaddleNLP_UIE>
```

This is the GPU case, where it exits immediately.

```
[2022-11-07 15:15:38,460] [ INFO] - Gradient Accumulation steps = 1
[2022-11-07 15:15:38,460] [ INFO] - Total optimization steps = 3860.0
[2022-11-07 15:15:38,460] [ INFO] - Total num train samples = 15440
[2022-11-07 15:19:04,664] [ INFO] - loss: 0.0056895, learning_rate: 1e-05, global_step: 10, interval_runtime: 205.5363, interval_samples_per_second: 0.019, interval_steps_per_second: 0.049, epoch: 0.0518
[2022-11-07 15:22:25,579] [ INFO] - loss: 0.00726472, learning_rate: 1e-05, global_step: 20, interval_runtime: 197.9083, interval_samples_per_second: 0.02, interval_steps_per_second: 0.051, epoch: 0.1036
[2022-11-07 15:25:52,342] [ INFO] - loss: 0.00368861, learning_rate: 1e-05, global_step: 30, interval_runtime: 206.7552, interval_samples_per_second: 0.019, interval_steps_per_second: 0.048, epoch: 0.1554
```

This is the CPU case. I don't understand why it exits immediately on GPU?

starryzwh · Nov 07 '22 08:11

Does fine-tuning on GPU work on your side? @starryzwh

LiuChiachi · Nov 08 '22 03:11

@yuwochangzai Please update paddlepaddle to 2.4.0rc0 and try again.

LiuChiachi · Nov 08 '22 03:11
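After upgrading, a quick sanity check that the new build is the one actually being imported and that it can run on the available device (paddle.utils.run_check() is paddle's built-in installation self-test):

```python
# Confirm the upgrade took effect in the active environment and that the
# installed build can run a small program on the available device.
import paddle

print(paddle.__version__)   # expect 2.4.0rc0 here after the upgrade
paddle.utils.run_check()    # paddle's built-in installation self-check
```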

Does fine-tuning on GPU work on your side? @starryzwh

Fine-tuning works, and paddlepaddle has already been updated to 2.4.0rc0. I just saw in another issue that @wawltor said paddlenlp needs to be the develop version, so I'll give that approach a try.

starryzwh · Nov 08 '22 05:11

@yuwochangzai Please update paddlepaddle to 2.4.0rc0 and try again.

It worked, thank you! (screenshot attached)

yuwochangzai · Nov 08 '22 08:11

@starryzwh How did you solve the problem of the process exiting immediately? I'm running into the same issue.

noobexplore · Dec 30 '22 04:12