[Question]: I need to deploy the PP-UIE model in an offline environment. I placed my trained model under uer/.paddlenlp/model/paddlenlp/PP-uie-7B, but every information-extraction run still needs to download shards. How can I fix this?
Please describe your question
I want to use the PP-UIE model in an offline environment, but every information-extraction run still needs to download shards.
The models can be downloaded for offline use here: https://paddlenlp.readthedocs.io/zh/latest/model_list.html
Sorry, but that address no longer seems to work. Also, once I have downloaded the model offline, how should I use it?
Sorry, the docs were updated recently. https://paddlenlp.readthedocs.io/zh/latest/website/index.html
You may have misunderstood me. I am using my own model via Taskflow('information_extraction', schema=self.schema, model='paddlenlp/PP-UIE-0.5B', precision='float32'). I placed my trained model under .paddlenlp so that my own checkpoint is used, but when I run it offline it raises an error:
[2025-04-27 10:01:23,559] [ INFO] - The unk_token parameter needs to be defined: we use eos_token by default.
[2025-04-27 10:01:23,764] [ INFO] - We are using <class 'paddlenlp.transformers.qwen2.modeling.Qwen2ForCausalLM'> to load 'paddlenlp/PP-UIE-0.5B'.
[2025-04-27 10:01:23,764] [ INFO] - Loading configuration file C:\Users\lenovo.paddlenlp\models\paddlenlp/PP-UIE-0.5B\config.json
[2025-04-27 10:01:23,765] [ INFO] - Loading weights file from cache at C:\Users\lenovo.paddlenlp\models\paddlenlp/PP-UIE-0.5B\model.safetensors.index.json
Downloading shards: 100%|██████████| 1/1 [00:00<00:00, 1763.79it/s]
[2025-04-27 10:02:18,967] [ INFO] - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
[2025-04-27 10:02:18,967] [ INFO] - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at paddlenlp/PP-UIE-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
[2025-04-27 10:02:18,975] [ INFO] - Loading configuration file C:\Users\lenovo.paddlenlp\models\paddlenlp/PP-UIE-0.5B\generation_config.json
I suspect it is this log line, Downloading shards: 100%|██████████| 1/1 [00:00<00:00, 1763.79it/s] - does every run need to download shards, and is that why it fails when there is no network?
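For what it's worth, the shard progress bar above completes in well under a second, which suggests the shard itself is read from the local cache rather than re-downloaded; in the traceback below, the request that actually fails offline is for chat_template.json on bj.bcebos.com. One way to avoid any remote lookup at all is to point the Taskflow at a fully local directory. The following is only a minimal sketch, assuming the task_path argument documented for fine-tuned UIE checkpoints behaves the same way for PP-UIE; the local path and schema are placeholders.

```python
from paddlenlp import Taskflow

# Minimal sketch: load PP-UIE from a fully local directory so no remote lookup
# is attempted. The directory should already contain config.json, the
# safetensors shard(s) plus model.safetensors.index.json, the tokenizer files,
# generation_config.json and chat_template.json.
schema = ["甲方", "乙方", "合同金额"]    # example schema; replace with your own
local_dir = r"C:\models\PP-UIE-0.5B"     # hypothetical path to your trained model

ie = Taskflow(
    "information_extraction",
    schema=schema,
    model="paddlenlp/PP-UIE-0.5B",
    # Assumption: task_path overrides the cached/downloaded location for PP-UIE
    # the same way it does for fine-tuned UIE checkpoints.
    task_path=local_dir,
    precision="float32",
)
print(ie("2024年1月,甲方A公司与乙方B公司签订合同,合同金额为100万元。"))
```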
Here is the error it reports when I run offline:
Traceback (most recent call last):
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connectionpool.py", line 464, in _make_request
self._validate_conn(conn)
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connectionpool.py", line 1093, in _validate_conn
conn.connect()
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connection.py", line 741, in connect
sock_and_verified = _ssl_wrap_socket_and_match_hostname(
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connection.py", line 920, in _ssl_wrap_socket_and_match_hostname
ssl_sock = ssl_wrap_socket(
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\util\ssl_.py", line 460, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\util\ssl_.py", line 504, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "D:\anaconda\envs\my_nlp\lib\ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "D:\anaconda\envs\my_nlp\lib\ssl.py", line 1104, in _create
self.do_handshake()
File "D:\anaconda\envs\my_nlp\lib\ssl.py", line 1375, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
response = self._make_request(
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connectionpool.py", line 488, in _make_request
raise new_e
urllib3.exceptions.SSLError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\anaconda\envs\my_nlp\lib\site-packages\requests\adapters.py", line 667, in send
resp = conn.urlopen(
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\connectionpool.py", line 841, in urlopen
retries = retries.increment(
File "D:\anaconda\envs\my_nlp\lib\site-packages\urllib3\util\retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/paddlenlp/PP-UIE-0.5B/chat_template.json (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\__init__.py", line 238, in resolve_file_path
is_available = bos_aistudio_hf_file_exist(
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\__init__.py", line 331, in bos_aistudio_hf_file_exist
out = bos_file_exists(
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\bos_download.py", line 269, in bos_file_exists
get_bos_file_metadata(url, token=token)
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\bos_download.py", line 100, in get_bos_file_metadata
r = _request_wrapper(
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\common.py", line 346, in _request_wrapper
response = _request_wrapper(
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\common.py", line 368, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "D:\anaconda\envs\my_nlp\lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "D:\anaconda\envs\my_nlp\lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "D:\anaconda\envs\my_nlp\lib\site-packages\requests\adapters.py", line 698, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/paddlenlp/PP-UIE-0.5B/chat_template.json (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\projects\contract_nlp_project\utils\pdf2txt.py", line 243, in <module>
a = File2Txt(r'F:\projects\contract_nlp_project\utils\2024-1.pdf', r'F:\projects\contract_nlp_project\utils')
File "F:\projects\contract_nlp_project\utils\pdf2txt.py", line 38, in __init__
self.ie = Taskflow('information_extraction', schema=self.schema, model='paddlenlp/PP-UIE-0.5B',
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\taskflow\taskflow.py", line 869, in __init__
self.task_instance = task_class(
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\taskflow\information_extraction.py", line 150, in __init__
self._construct_tokenizer(model)
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\taskflow\information_extraction.py", line 178, in _construct_tokenizer
self._tokenizer = AutoTokenizer.from_pretrained(model)
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\transformers\auto\tokenizer.py", line 478, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\transformers\tokenizer_utils.py", line 835, in from_pretrained
tokenizer, tokenizer_config_file_dir = super().from_pretrained(pretrained_model_name_or_path, *args, **kwargs)
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\transformers\tokenizer_utils_base.py", line 1590, in from_pretrained
resolved_vocab_files[file_id] = resolve_file_path(
File "D:\anaconda\envs\my_nlp\lib\site-packages\paddlenlp\utils\download\__init__.py", line 286, in resolve_file_path
raise EnvironmentError(
OSError: Can't load the model for 'paddlenlp/PP-UIE-0.5B'. If you were trying to load it from 'BOS', make sure you don't have a local directory with the same name. Otherwise, make sure 'paddlenlp/PP-UIE-0.5B' is the correct path to a directory containing one of the ['chat_template.json']
Hi, could you try creating an empty chat_template.json file locally?
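In case it is useful, here is a minimal sketch of that workaround. The target directory is the cached model folder printed in the "Loading configuration file ..." log line above; the default PaddleNLP cache location is assumed below, so adjust the path if your cache lives elsewhere.

```python
import os

# Minimal sketch of the suggested workaround: put a chat_template.json next to
# the cached model files so the tokenizer loader finds it locally and never
# tries to reach bj.bcebos.com.
# Use the directory printed in your own "Loading configuration file ..." log
# line; the default PaddleNLP cache location is assumed here.
model_dir = os.path.expanduser("~/.paddlenlp/models/paddlenlp/PP-UIE-0.5B")
chat_template = os.path.join(model_dir, "chat_template.json")

if not os.path.exists(chat_template):
    # The suggestion here is literally an empty file.
    open(chat_template, "w", encoding="utf-8").close()
```

Whether a zero-byte file is enough appears to depend on the PaddleNLP version; see the JSONDecodeError reported further down the thread.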
Has this been resolved? I am also running into a similar problem in an offline environment.
Confirmed, creating an empty chat_template.json file locally fixed it for me.
This issue is stale because it has been open for 60 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Where exactly should this empty file go? I put it under the model directory (PP-UIE-1.5B), but running it still raises an error:
File "/usr/local/lib/python3.12/json/decoder.py", line 356, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
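That JSONDecodeError is raised while json-parsing a zero-byte file, so on this version the loader apparently parses chat_template.json rather than only checking that it exists. A hedged guess, not confirmed anywhere in this thread: writing a minimal valid JSON document such as {} instead of a truly empty file should at least get past the parse step. The directory below is a placeholder; the file is expected next to config.json and the tokenizer files of the model you load.

```python
import os

# Hedged sketch: write "{}" (an empty JSON object) instead of a zero-byte file
# so that json.load() succeeds. Whether an empty chat template is acceptable to
# PP-UIE beyond that point is an assumption.
model_dir = "/path/to/PP-UIE-1.5B"  # placeholder; use your local model directory
with open(os.path.join(model_dir, "chat_template.json"), "w", encoding="utf-8") as f:
    f.write("{}")
```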