[Question]: How do I use the model after fine-tuning?
I am fine-tuning the ernie_layout model for doc_vqa. Because the full training run takes too long, I only trained for 1 epoch to check the results, and I currently have the checkpoint at step 40000. After training, how do I use this fine-tuned model to run prediction?
- The documentation does not explain how to predict directly with the fine-tuned model, i.e. how to specify a checkpoint for prediction. I only see that do_train can resume training from a checkpoint, but do_predict does not seem to use the model under output_dir/checkpoint. If I pass --do_predict, which parameters do I need and what should I change? Please advise.
- Is it mandatory to convert the dynamic graph to a static graph for deployment before prediction? Normally that should not be necessary, but I could not find any documentation on this. Could you point me to the relevant docs?
- I also tried export_model, but it fails with the error below. This is my output_dir; does training have to finish completely before I can export and predict?
├── ernie-layoutx-base-uncased
│   └── models
│       └── docvqa_zh
│           ├── checkpoint-30000
│           │   ├── model_config.json
│           │   ├── model_state.pdparams
│           │   ├── optimizer.pdopt
│           │   ├── rng_state.pth
│           │   ├── scheduler.pdparams
│           │   ├── sentencepiece.bpe.model
│           │   ├── special_tokens_map.json
│           │   ├── tokenizer_config.json
│           │   ├── trainer_state.json
│           │   ├── training_args.bin
│           │   └── vocab.txt
│           ├── checkpoint-40000
│           │   ├── model_config.json
│           │   ├── model_state.pdparams
│           │   ├── optimizer.pdopt
│           │   ├── rng_state.pth
│           │   ├── scheduler.pdparams
│           │   ├── sentencepiece.bpe.model
│           │   ├── special_tokens_map.json
│           │   ├── tokenizer_config.json
│           │   ├── trainer_state.json
│           │   ├── training_args.bin
│           │   └── vocab.txt
│           ├── eval_golden_labels.json
│           ├── eval_predictions.json
│           └── runs
/usr/bin/env /home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/bin/python /home/hehuang/.vscode/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/launcher 38467 -- /home/hehuang/dev/git/PaddleNLP/model_zoo/ernie-layout/export_model.py --model_path ./ernie-layoutx-base-uncased/models/docvqa_zh/ --task_type mrc --output_path ./mrc_export
[2022-10-27 15:49:56,644] [ INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/models/community/./ernie-layoutx-base-uncased/models/docvqa_zh/model_config.json
[2022-10-27 15:49:56,871] [ ERROR] - Downloading from https://bj.bcebos.com/paddlenlp/models/community/./ernie-layoutx-base-uncased/models/docvqa_zh/model_config.json failed with code 404!
Traceback (most recent call last):
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 278, in _from_pretrained
    resolved_vocab_file = get_path_from_url(community_config_path,
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/utils/downloader.py", line 164, in get_path_from_url
    fullpath = _download(url, root_dir, md5sum)
  File "/home/hehuang/dev/git/PaddleNLP/paddlenlp/utils/downloader.py", line 200, in _download
    raise RuntimeError("Downloading from {} failed with code "
RuntimeError: Downloading from https://bj.bcebos.com/paddlenlp/models/community/./ernie-layoutx-base-uncased/models/docvqa_zh/model_config.json failed with code 404!
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/hehuang/dev/env/anaconda3/envs/paddlenlp-gpu/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/hehuang/.vscode/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/main.py", line 45, in
- a correct model-identifier of built-in pretrained models,
- or a correct model-identifier of community-contributed pretrained models,
- or the correct path to a directory containing relevant modeling files(model_weights and model_config)
I looked at the source code: at every save_steps the model is first saved into a checkpoint directory (a checkpoint stores the full model as well, also written via model.save). Only after training finishes is the best model, chosen by comparison, saved to output_dir. In other words, the final export to output_dir only happens once training has completed.
If you want to use a checkpoint model, that is also possible: just point the local model path at the checkpoint directory. In other words, to run prediction with the fine-tuned checkpoint, the parameter to change is model_name_or_path, which should point to the local checkpoint path, as in the command below:
python3 -u run_mrc.py \
    --model_name_or_path ./ernie-layoutx-base-uncased/models/docvqa_zh/checkpoint-40000 \
    --output_dir ./predict_result \
    --dataset_name docvqa_zh \
    --do_predict \
    --lang "ch" \
    --num_train_epochs 6 \
    --lr_scheduler_type linear \
    --warmup_ratio 0.05 \
    --weight_decay 0 \
    --pattern "mrc" \
    --use_segment_box false \
    --return_entity_level_metrics false \
    --overwrite_cache false \
    --doc_stride 128 \
    --target_size 1000 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --learning_rate 2e-5 \
    --preprocessing_num_workers 32 \
    --save_total_limit 1 \
    --train_nshard 16 \
    --seed 1000 \
    --metric_for_best_model anls \
    --greater_is_better true
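If you just want to sanity-check that the checkpoint loads before running the full script, the snippet below is a minimal sketch (not taken from the PaddleNLP docs); it assumes the ErnieLayoutTokenizer and ErnieLayoutForQuestionAnswering classes from paddlenlp.transformers and reuses the checkpoint path shown above:

from paddlenlp.transformers import ErnieLayoutTokenizer, ErnieLayoutForQuestionAnswering

# Local checkpoint directory from this thread; replace with your own path
ckpt = "./ernie-layoutx-base-uncased/models/docvqa_zh/checkpoint-40000"

# from_pretrained accepts a local directory containing model_config.json and model_state.pdparams
tokenizer = ErnieLayoutTokenizer.from_pretrained(ckpt)
model = ErnieLayoutForQuestionAnswering.from_pretrained(ckpt)
model.eval()  # switch to inference mode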
Exporting from a checkpoint also works in the same way: just point model_path at the checkpoint directory.
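For example, based on the export_model.py invocation in the error log above (the --model_path, --task_type and --output_path flags all come from that log, and the checkpoint path is the one from this thread, so substitute your own), the corrected command would look roughly like this:

python export_model.py \
    --model_path ./ernie-layoutx-base-uncased/models/docvqa_zh/checkpoint-40000 \
    --task_type mrc \
    --output_path ./mrc_export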
Where can I download the docvqa_zh dataset?
Just run the reading-comprehension script run_mrc.py and the dataset will be downloaded automatically. This is handled by the datasets _split_generators mechanism used by paddle (it works exactly the same as huggingface's datasets, if you are familiar with hf). For the dataset code, see the module paddlenlp/datasets/hf_datasets/docvqa_zh.py: in _split_generators you can see the splits being generated and the download_manager fetching the archive from _URL = "https://bj.bcebos.com/paddlenlp/datasets/docvqa_zh.tar.gz". There is no need to download it yourself. You could download it manually and place it under .cache/huggingface**, but the cached files are keyed by a hash, so simply dropping the archive there will not work.
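If you want to trigger the download outside of run_mrc.py, the snippet below is a minimal sketch assuming the huggingface datasets package is installed and that you run it from the PaddleNLP repo root; the loading-script path and the download URL are the ones mentioned above, everything else is illustrative:

from datasets import load_dataset

# Loading the local script lets _split_generators fetch
# https://bj.bcebos.com/paddlenlp/datasets/docvqa_zh.tar.gz into the HF cache on first use
ds = load_dataset("paddlenlp/datasets/hf_datasets/docvqa_zh.py")
print(ds)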
Download from Baidu's servers is also fast, so no need to worry.