mindnlp
Tokenizer raises an exception when run offline
Describe the bug (Mandatory)
T5Tokenizer.from_pretrained(strTokenizer, cache_dir=cache_dir, local_files_only=True)
With local_files_only set to True in from_pretrained, the call still tries to request files from the external network when the machine is offline.
- Hardware Environment (Ascend/GPU/CPU): CPU, on Windows
- Software Environment (Mandatory):
-- MindSpore version (e.g., 1.7.0.Bxxx): 2.4.0
-- Python version (e.g., Python 3.7.5): 3.10
- Execute Mode (Mandatory) (PyNative/Graph):
Please delete the mode not involved:
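To Reproduce (Mandatory)
Steps to reproduce the behavior:
- Run this statement in Python: T5Tokenizer.from_pretrained('google-t5/t5-small', cache_dir=cache_dir, local_files_only=True)
- The statement has already been run once.
- Turn off the local network.
- Run the statement again; it fails with an error about connecting to the external network.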
Expected behavior (Mandatory)
Tokenizer files that have already been downloaded should not have to be downloaded again when offline.
Screenshots / Logs (Mandatory)
Traceback (most recent call last):
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 667, in send
    resp = conn.urlopen(
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 841, in urlopen
    retries = retries.increment(
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='hf-mirror.com', port=443): Max retries exceeded with url: /google-t5/t5-small/resolve/main/added_tokens.json?download=true (Caused by ProxyError('Unable to connect to proxy', FileNotFoundError(2, 'No such file or directory')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\mindnlp\utils\download.py", line 503, in cached_file
    resolved_file = download(
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\mindnlp\utils\download.py", line 627, in download
    pointer_path = http_get(url, storage_folder, download_file_name=relative_filename, proxies=proxies, headers=headers)
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\mindnlp\utils\download.py", line 198, in http_get
    req = requests.get(url, stream=True, timeout=10, proxies=proxies, headers=headers)
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\bluen\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 694, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='hf-mirror.com', port=443): Max retries exceeded with url: /google-t5/t5-small/resolve/main/added_tokens.json?download=true (Caused by ProxyError('Unable to connect to proxy', FileNotFoundError(2, 'No such file or directory')))
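For reference, the behavior this report expects from local_files_only can be sketched as a cache-first lookup that never reaches the network when the flag is set. This is only an illustration under stated assumptions, not mindnlp's actual implementation: resolve_file and its fetch callback are hypothetical names, and the cache is assumed to be a plain directory of files.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_file(
    filename: str,
    cache_dir: str,
    local_files_only: bool = False,
    fetch: Optional[Callable[[str, Path], None]] = None,
) -> Optional[Path]:
    """Return the cached path of `filename`, downloading only when allowed.

    `fetch` is a hypothetical downloader callback that writes the file
    to the given target path; it stands in for any real HTTP client.
    """
    target = Path(cache_dir) / filename
    if target.exists():
        # Cache hit: no network access is needed in either mode.
        return target
    if local_files_only or fetch is None:
        # Offline mode: a missing file is reported as absent (None)
        # instead of triggering a download attempt.
        return None
    target.parent.mkdir(parents=True, exist_ok=True)
    fetch(filename, target)
    return target if target.exists() else None
```

Under this sketch, a file that is missing from the cache resolves to None when local_files_only is True, so the caller can decide whether it is required or optional. The traceback above shows the failing request is for added_tokens.json, which may simply be a file that was never written to the cache.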
Additional context (Optional)
This is a network problem: you need to switch to a different mirror or use a VPN.
Is cache_dir not taking effect?
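One way to check whether cache_dir took effect is to list what the first (online) run actually wrote there. The helper below is a minimal sketch using only the standard library; list_cached_files is a hypothetical name, and it makes no assumption about how mindnlp lays out files inside the directory.

```python
from pathlib import Path
from typing import List


def list_cached_files(cache_dir: str) -> List[str]:
    """Return every file under `cache_dir` as a sorted, '/'-separated relative path."""
    root = Path(cache_dir)
    if not root.is_dir():
        # The directory was never created, so nothing was cached there.
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
```

If the list is empty after a successful online run, the files were probably stored somewhere else. If it contains the tokenizer files but not the one named in the traceback, the offline failure is more likely caused by a lookup for a file that was never cached.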