
[Question]: How do I use the model after fine-tuning?

hehuang139 opened this issue 3 years ago • 3 comments

I am fine-tuning the ernie_layout model for doc_vqa. Since full training takes too long, I only trained for 1 epoch to check the effect, and I am now at checkpoint 40000. After training, how do I use this fine-tuned model to run prediction?

  1. The docs don't explain how to run prediction directly with the fine-tuned model, i.e. by specifying a checkpoint. I only see that do_train can resume from a checkpoint, but do_predict does not seem to use the model under output_dir/checkpoint. If I pass --do_predict, which parameters do I need and how should I change them? Please advise.
  2. Is it mandatory to convert the dynamic graph to static form and deploy it before prediction? Normally that shouldn't be necessary, but I couldn't find any documentation; could you point me to it?
  3. I also tried export_model, but it errors out. Below is my output_dir. Do I have to finish training before I can export and predict?

├── ernie-layoutx-base-uncased
│   └── models
│       └── docvqa_zh
│           ├── checkpoint-30000
│           │   ├── model_config.json
│           │   ├── model_state.pdparams
│           │   ├── optimizer.pdopt
│           │   ├── rng_state.pth
│           │   ├── scheduler.pdparams
│           │   ├── sentencepiece.bpe.model
│           │   ├── special_tokens_map.json
│           │   ├── tokenizer_config.json
│           │   ├── trainer_state.json
│           │   ├── training_args.bin
│           │   └── vocab.txt
│           ├── checkpoint-40000
│           │   ├── model_config.json
│           │   ├── model_state.pdparams
│           │   ├── optimizer.pdopt
│           │   ├── rng_state.pth
│           │   ├── scheduler.pdparams
│           │   ├── sentencepiece.bpe.model
│           │   ├── special_tokens_map.json
│           │   ├── tokenizer_config.json
│           │   ├── trainer_state.json
│           │   ├── training_args.bin
│           │   └── vocab.txt
│           ├── eval_golden_labels.json
│           ├── eval_predictions.json
│           └── runs

/usr/bin/env /home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/bin/python /home/hehuang/.vscode/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/launcher 38467 -- /home/hehuang/dev/git/PaddleNLP/model_zoo/ernie-layout/export_model.py --model_path ./ernie-layoutx-base-uncased/models/docvqa_zh/ --task_type mrc --output_path ./mrc_export

[2022-10-27 15:49:56,644] [ INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/models/community/./ernie-layoutx-base-uncased/models/docvqa_zh/model_config.json
[2022-10-27 15:49:56,871] [ ERROR] - Downloading from https://bj.bcebos.com/paddlenlp/models/community/./ernie-layoutx-base-uncased/models/docvqa_zh/model_config.json failed with code 404!
Traceback (most recent call last):
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 278, in _from_pretrained
    resolved_vocab_file = get_path_from_url(community_config_path,
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/utils/downloader.py", line 164, in get_path_from_url
    fullpath = _download(url, root_dir, md5sum)
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/utils/downloader.py", line 200, in _download
    raise RuntimeError("Downloading from {} failed with code "
RuntimeError: Downloading from https://bj.bcebos.com/paddlenlp/models/community/./ernie-layoutx-base-uncased/models/docvqa_zh/model_config.json failed with code 404!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/hehuang/.vscode/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/hehuang/.vscode/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/hehuang/.vscode/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 288, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/hehuang/dev/git/PaddleNLP/model_zoo/ernie-layout/export_model.py", line 33, in <module>
    model = AutoModelForQuestionAnswering.from_pretrained(args.model_path)
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 597, in from_pretrained
    return cls._from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 282, in _from_pretrained
    raise RuntimeError(
RuntimeError: Can't load weights for './ernie-layoutx-base-uncased/models/docvqa_zh/'. Please make sure that './ernie-layoutx-base-uncased/models/docvqa_zh/' is:

  • a correct model-identifier of built-in pretrained models,
  • or a correct model-identifier of community-contributed pretrained models,
  • or the correct path to a directory containing relevant modeling files(model_weights and model_config)

hehuang139 · Oct 27 '22 07:10

I looked at the source code: every save_steps, the model is first saved into a checkpoint directory; a checkpoint also stores the model, saved via model.save as well. Only after training finishes is the best model (after comparison) saved to output_dir. In other words, exporting the model from output_dir is only possible once training completes.

If you want to use a checkpoint model, that also works: just specify the local model path as the checkpoint directory.

That is, to run prediction with a fine-tuned checkpoint, change --model_name_or_path to point to the local checkpoint path:

python3 -u run_mrc.py \
    --model_name_or_path ./ernie-layoutx-base-uncased/models/docvqa_zh/checkpoint-40000 \
    --output_dir ./predict_result \
    --dataset_name docvqa_zh \
    --do_predict \
    --lang "ch" \
    --num_train_epochs 6 \
    --lr_scheduler_type linear \
    --warmup_ratio 0.05 \
    --weight_decay 0 \
    --pattern "mrc" \
    --use_segment_box false \
    --return_entity_level_metrics false \
    --overwrite_cache false \
    --doc_stride 128 \
    --target_size 1000 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --learning_rate 2e-5 \
    --preprocessing_num_workers 32 \
    --save_total_limit 1 \
    --train_nshard 16 \
    --seed 1000 \
    --metric_for_best_model anls \
    --greater_is_better true

hehuang139 · Oct 27 '22 08:10

Exporting from a checkpoint also works: just point --model_path to the checkpoint directory.
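
For example, rerunning the export command from the log above, but with --model_path pointing at the checkpoint directory (paths follow my local layout):

python3 export_model.py \
    --model_path ./ernie-layoutx-base-uncased/models/docvqa_zh/checkpoint-40000 \
    --task_type mrc \
    --output_path ./mrc_export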

hehuang139 · Oct 27 '22 08:10

Where can I download the docvqa_zh dataset?

chaiyixuan · Nov 10 '22 07:11

> Where can I download the docvqa_zh dataset?

Just run the reading-comprehension script run_mrc.py and it downloads automatically. This goes through Paddle's datasets mechanism and its _split_generators (it works exactly like huggingface datasets, if you are familiar with HF). For the dataset code, see the module paddlenlp/datasets/hf_datasets/docvqa_zh.py: there, _split_generators splits the dataset and downloads it through the download_manager from _URL = "https://bj.bcebos.com/paddlenlp/datasets/docvqa_zh.tar.gz". There is no need to download it yourself. You could download it manually and put it under the .cache/huggingface** directory, but the downloaded files are hash-named, so just dropping the file in there won't work.
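
If you want to trigger the download yourself outside of run_mrc.py, a minimal sketch is below (it assumes the huggingface datasets package is installed and that you run it from the PaddleNLP repo root; it only illustrates the loading-script mechanism and is not the exact call run_mrc.py makes):

from datasets import load_dataset

# Illustration only: load docvqa_zh through the HF-style loading script shipped with
# PaddleNLP. The script's _split_generators() downloads docvqa_zh.tar.gz from _URL via
# its download manager on first use and caches it (hash-named) under ~/.cache/huggingface.
dataset = load_dataset("paddlenlp/datasets/hf_datasets/docvqa_zh.py")
print(dataset)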

hehuang139 · Nov 11 '22 14:11

> Where can I download the docvqa_zh dataset?

The download from Baidu (bcebos) is also fast, so no need to worry.

hehuang139 · Nov 11 '22 14:11