gpt_academic

[Bug]: NOUGAT precise PDF translation fails

Open foundnom opened this issue 1 year ago • 2 comments

Installation Method

Anaconda (I used latest requirements.txt)

Version

Latest

OS

Windows

Describe the bug

The plugin reports that Nougat failed to parse the paper.

Screen Shot

Screenshot of the page reporting the error: (image)

Terminal Traceback & Material to Help Reproduce Bugs

Terminal traceback:

WARNING:root:No GPU found. Conversion on CPU is very slow.
Traceback (most recent call last):
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\transformers\configuration_utils.py", line 718, in _get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\transformers\configuration_utils.py", line 817, in _dict_from_json_file
    return json.loads(text)
           ^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\ProgramData\anaconda3\envs\gptaca\Scripts\nougat.exe\__main__.py", line 7, in <module>
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\predict.py", line 127, in main
    model = NougatModel.from_pretrained(args.checkpoint)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\nougat\model.py", line 684, in from_pretrained
    model = super(NougatModel, cls).from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\transformers\modeling_utils.py", line 2981, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\transformers\configuration_utils.py", line 604, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\transformers\configuration_utils.py", line 633, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\anaconda3\envs\gptaca\Lib\site-packages\transformers\configuration_utils.py", line 721, in _get_config_dict
    raise EnvironmentError(
OSError: It looks like the config file at 'C:\Users\Found\.cache\torch\hub\nougat-0.1.0-small\config.json' is not a valid JSON file.
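
The `JSONDecodeError` at "line 1 column 1 (char 0)" is what `json.loads` raises when the text is empty or does not begin with a JSON value, for example after an interrupted download or when an HTML error page was saved in place of the file. A minimal sketch for inspecting the cached file follows; `check_json_file` is a hypothetical helper, not part of gpt_academic or nougat, and the path is simply copied from the error message above.

```python
import json
from pathlib import Path


def check_json_file(path):
    """Return (ok, detail) describing whether `path` holds parseable JSON."""
    p = Path(path)
    if not p.is_file():
        return False, "file does not exist"
    raw = p.read_bytes()
    if not raw.strip():
        return False, "file is empty"
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Show the first bytes so an HTML page or truncated download is easy to spot.
        return False, f"{exc}; file starts with {raw[:40]!r}"
    return True, "valid JSON"


if __name__ == "__main__":
    # Path copied from the OSError in the traceback; adjust for your machine.
    config = r"C:\Users\Found\.cache\torch\hub\nougat-0.1.0-small\config.json"
    print(check_json_file(config))
```

If the file turns out to be empty or not JSON, a commonly tried remedy is to delete the cached `nougat-0.1.0-small` folder so the checkpoint is downloaded again. That is an assumption about the cause, not something this report confirms.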

foundnom, May 23 '24 06:05

Same problem here: [Local Message] Plugin call failed:

Traceback (most recent call last):
  File "d:\Program Files\whr\Python_Project\gpt_academic-master\toolbox.py", line 207, in decorated
    yield from f(main_input, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, *args, **kwargs)
  File "d:\Program Files\whr\Python_Project\gpt_academic-master\crazy_functions\批量翻译PDF文档_NOUGAT.py", line 93, in 批量翻译PDF文档
    yield from 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
  File "d:\Program Files\whr\Python_Project\gpt_academic-master\crazy_functions\批量翻译PDF文档_NOUGAT.py", line 111, in 解析PDF_基于NOUGAT
    fpp = yield from nougat_handle.NOUGAT_parse_pdf(fp, chatbot, history)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\Program Files\whr\Python_Project\gpt_academic-master\crazy_functions\crazy_utils.py", line 600, in NOUGAT_parse_pdf
    raise RuntimeError("Nougat解析论文失败。")
RuntimeError: Nougat解析论文失败。

ipc-robot, May 27 '24 10:05

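import logging
import os

def nougat_with_timeout(self, command, cwd, timeout=1e100):
    import subprocess
    from toolbox import ProxyNetworkActivate
    logging.info(f'Executing command {command}')
    # Launch the command inside the proxy context named "Nougat_Download".
    with ProxyNetworkActivate("Nougat_Download"):
        process = subprocess.Popen(command, shell=False, cwd=cwd, env=os.environ)
    try:
        # Wait for the process to finish, up to `timeout` seconds.
        stdout, stderr = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the process and reap it so it does not linger.
        process.kill()
        stdout, stderr = process.communicate()
        print("Process timed out!")
        return False
    # Note: True only means no timeout occurred; the exit code is not checked.
    return True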

Why does it print "Process timed out!" as soon as I click run? No matter what value timeout is set to, it times out the moment it starts.

ipc-robot, May 28 '24 02:05
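
The function quoted above only reports whether a timeout occurred, so a command that exits with an error (such as the invalid `config.json` in the first traceback) looks the same as a successful run. The sketch below is one way to separate the three outcomes while debugging. `run_with_timeout` is a hypothetical stand-in, not the gpt_academic implementation, and it does not establish why the timeout fires immediately in this report. It relies on `subprocess.run`, where `timeout=None` means "wait indefinitely".

```python
import subprocess
import sys


def run_with_timeout(command, cwd=None, timeout=None):
    """Run `command` and return (ok, reason, stderr_text).

    timeout is in seconds; None waits without a limit.
    """
    try:
        result = subprocess.run(
            command,
            cwd=cwd,
            timeout=timeout,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return False, f"timed out after {timeout} s", ""
    if result.returncode != 0:
        # A failed command is reported separately from a timeout.
        return False, f"exited with code {result.returncode}", result.stderr
    return True, "ok", result.stderr


if __name__ == "__main__":
    # Stand-in command; replace it with the nougat command line being debugged.
    cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_with_timeout(cmd, timeout=30))
```

Capturing stderr keeps the child's own error message available to the caller, at the cost of not showing its progress output live in the terminal.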