
Can't read config.cfg

Open Bhavin1996 opened this issue 4 years ago • 18 comments

OSError: [E053] Could not read config.cfg from C:\Users\bhavi\AppData\Local\Programs\Python\Python39\lib\site-packages\resume_parser\degree\model\config.cfg

Bhavin1996 avatar Feb 06 '21 23:02 Bhavin1996

Hey, make sure you have installed the correct versions: spacy==2.3.5 and en_core_web_sm==2.3.1. See this Colab notebook: https://colab.research.google.com/drive/1p6rhi9g0ughtGBojnCJcPVRRNqziuk3K?usp=sharing

kbrajwani avatar Feb 07 '21 02:02 kbrajwani

I ran python -m spacy validate and confirmed that the spaCy version is 2.3.5 and the en_core_web_sm version is 2.3.1. When I run from resume_parser import resumeparse, I get user warning [W031], which says that Model 'en_training' (0.0.0) requires spaCy v2.2 and is incompatible with spaCy 2.3.5.

diracsol avatar Feb 21 '21 13:02 diracsol

I also encounter this issue. Yes, it works fine in Colab, with some warnings, but when I run it on my Ubuntu server it gets stuck after the warning.

Jeyandranath avatar Mar 20 '21 17:03 Jeyandranath

Hey @Jeyandranath, can you please share some logs from the point where the process gets stuck? Can you also share the resume on which it gets stuck?

kbrajwani avatar Mar 20 '21 18:03 kbrajwani

Tested in Windows, works fine with the warning below:

UserWarning: [W031] Model 'en_training' (0.0.0) requires spaCy v2.2 and is incompatible with the current spaCy version (2.3.5). This may lead to unexpected results or runtime errors. To resolve this, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
  warnings.warn(warn_msg)

data = resumeparse.read_file('hello.pdf')
2021-03-21 00:40:45,448 [MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server/1.24/tika-server-1.24.jar to C:\Users\CHARUJ~1\AppData\Local\Temp\tika-server.jar.
2021-03-21 00:41:16,323 [MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server/1.24/tika-server-1.24.jar.md5 to C:\Users\CHARUJ~1\AppData\Local\Temp\tika-server.jar.md5.
2021-03-21 00:41:19,471 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2021-03-21 00:41:24,493 [MainThread ] [WARNI] Failed to see startup log message; retrying...
print(data)
{'email': '[email protected]', 'phone': '+91-98845-92980', 'name': 'SHRAVAN KUMAR', 'total_exp': 4, 'university': [], 'designition': ['finance analyst', 'operations tech', 'deputy manager'], 'degree': ['B.Com Degree'], 'skills': ['Known: Tamil', ' English', ' and Tulu', 'Present Address: 22 Vijayalakshmi Avenue', 'Poonamallee', ' Chennai-56'], 'Companies worked at': ['92980', 'SAP', 'Hyundai Motor India Ltd', 'Hyundai Motor India Ltd.']}

Jeyandranath avatar Mar 20 '21 19:03 Jeyandranath

On Ubuntu, it gets stuck after this warning. Resume: hello.pdf

UserWarning: [W031] Model 'en_training' (0.0.0) requires spaCy v2.2 and is incompatible with the current spaCy version (2.3.5). This may lead to unexpected results or runtime errors. To resolve this, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
  warnings.warn(warn_msg)

Jeyandranath avatar Mar 20 '21 19:03 Jeyandranath

I think Java is the issue...

Jeyandranath avatar Mar 20 '21 19:03 Jeyandranath

There is no file called config.cfg in the path resume_parser\degree\model\, not even in the GitHub repository. What should config.cfg contain?

RohitJacob avatar Mar 22 '21 09:03 RohitJacob

Yep, same problem here within a Python 3.8 virtual environment (I followed the official installation instructions from here):

>>> from resume_parser import resumeparse
/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py:715: UserWarning: [W094] Model 'en_training' (0.0.0) specifies an under-constrained spaCy version requirement: >=2.2.4. This can lead to compatibility problems with older versions, or as new spaCy versions are released, because the model may say it's compatible when it's not. Consider changing the "spacy_version" in your meta.json to a version range, with a lower and upper pin. For example: >=3.0.5,<3.1.0
  warnings.warn(warn_msg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/__init__.py", line 1, in <module>
    from resume_parser.resumeparse import resumeparse
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/resumeparse.py", line 50, in <module>
    custom_nlp2 = spacy.load(os.path.join(base_path,"degree","model"))
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/__init__.py", line 47, in load
    return util.load_model(name, disable=disable, exclude=exclude, config=config)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 324, in load_model
    return load_model_from_path(Path(name), **kwargs)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 388, in load_model_from_path
    config = load_config(config_path, overrides=dict_to_dot(config))
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 545, in load_config
    raise IOError(Errors.E053.format(path=config_path, name="config.cfg"))
OSError: [E053] Could not read config.cfg from /home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/degree/model/config.cfg

That config file does not actually exist at that location, but if it is located somewhere else, I can move it there. Where is it, and what should it contain?
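Judging from the traceback above, the installed spaCy appears to be a 3.x release (the W094 message suggests the range >=3.0.5,<3.1.0), and that code path reads config.cfg from the model directory, while the bundled model declares spacy_version >=2.2.4 in its meta.json. A minimal sketch for inspecting a model directory without loading it follows; the function name describe_model_dir is hypothetical and not part of spaCy or resume_parser.

```python
import json
from pathlib import Path


def describe_model_dir(model_dir):
    """Report what a spaCy model directory contains, without loading it."""
    model_dir = Path(model_dir)
    meta_path = model_dir / "meta.json"
    info = {
        # The spaCy 3.x loader in the traceback fails when this file is missing.
        "has_config_cfg": (model_dir / "config.cfg").is_file(),
        "has_meta_json": meta_path.is_file(),
        "name": None,
        "spacy_version": None,
    }
    if info["has_meta_json"]:
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        info["name"] = meta.get("name")
        # The spaCy version requirement the model was packaged with.
        info["spacy_version"] = meta.get("spacy_version")
    return info


if __name__ == "__main__":
    import sys

    # Pass the path to .../site-packages/resume_parser/degree/model as an argument.
    if len(sys.argv) > 1:
        print(describe_model_dir(sys.argv[1]))
```

If has_config_cfg is False and spacy_version points at 2.x, the model was most likely saved with spaCy 2.x, and matching the installed spaCy to it is probably a safer route than copying a config.cfg from another model.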

GuidoBartoli avatar Apr 16 '21 09:04 GuidoBartoli

After some experiments, I found a config.cfg file inside my virtual environment (in ~/.virtualenvs/rsm/lib/python3.8/site-packages/en_core_web_sm/en_core_web_sm-3.0.0) and copied it to the folder required by resume_parser. That solved the previous error, but another one appears:

>>> from resume_parser import resumeparse
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/__init__.py", line 1, in <module>
    from resume_parser.resumeparse import resumeparse
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/resumeparse.py", line 50, in <module>
    custom_nlp2 = spacy.load(os.path.join(base_path,"degree","model"))
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/__init__.py", line 47, in load
    return util.load_model(name, disable=disable, exclude=exclude, config=config)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 324, in load_model
    return load_model_from_path(Path(name), **kwargs)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 390, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/language.py", line 1863, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 1174, in from_disk
    reader(path / key)
  File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/language.py", line 1849, in <lambda>
    deserializers["tokenizer"] = lambda p: self.tokenizer.from_disk(
  File "spacy/tokenizer.pyx", line 740, in spacy.tokenizer.Tokenizer.from_disk
  File "spacy/tokenizer.pyx", line 803, in spacy.tokenizer.Tokenizer.from_bytes
  File "spacy/tokenizer.pyx", line 570, in spacy.tokenizer.Tokenizer._load_special_cases
  File "spacy/tokenizer.pyx", line 589, in spacy.tokenizer.Tokenizer._validate_special_case
ValueError: [E1005] Unable to set attribute 'POS' in tokenizer exception for '	'. Tokenizer exceptions are only allowed to specify ORTH and NORM.

This is harder to understand... do you have any suggestions?

GuidoBartoli avatar Apr 16 '21 09:04 GuidoBartoli

Hey, please make sure your installed versions match these requirements: spacy==2.3.5 and en_core_web_sm==2.3.1. config.cfg is a spaCy configuration file; it is downloaded when the en_core_web_sm package is installed. I will try to update the model when I get some time. Thanks.

kbrajwani avatar Apr 16 '21 16:04 kbrajwani

I have the same problem. I installed the library following the requirements, but it doesn't work for me.

ranyaphat29 avatar Apr 20 '21 05:04 ranyaphat29

I have hit the same issue: the runtime gets stuck while importing resume_parser (with spacy 2.3.5 and en_core_web_sm 2.3.1). Even the Colab notebook gets stuck at the same step. Could you fix this, or let us know what causes it?

bharath-ts avatar Apr 28 '21 12:04 bharath-ts

I have the same issue. Do you have any suggestions, please?

1zineb avatar Apr 29 '21 01:04 1zineb

I have also encountered this. Can you please check locally by installing the same way as in the Colab notebook? I will fix it when I get time.

kbrajwani avatar Apr 30 '21 05:04 kbrajwani

Hey guys, I have solved it in the Colab notebook. If you want to install it locally, please follow the steps below.

  1. Create a new Python environment: https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/
  2. Install the library: pip install resume-parser
  3. Install en_core_web_sm: pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz
  4. Install importlib-metadata: pip install importlib-metadata==3.2.0

Now you can use the library.
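After following the steps, it may help to confirm that the environment really holds the versions mentioned in this thread. The sketch below uses the standard importlib.metadata module (Python 3.8+); the PINS table and the check_pins helper are hypothetical names, and the pinned versions are simply the ones quoted in this thread.

```python
from importlib import metadata

# Versions quoted in this thread; adjust if the maintainer changes them.
PINS = {
    "spacy": "2.3.5",
    "en_core_web_sm": "2.3.1",
    "importlib-metadata": "3.2.0",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: found_version_or_None} for every pin that is not met."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in check_pins(PINS).items():
        print(f"{name}: expected {PINS[name]}, found {found or 'not installed'}")
```

An empty result means all three pins are satisfied; anything else lists the package together with the version that was actually found.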

kbrajwani avatar May 01 '21 03:05 kbrajwani

I had some trouble understanding the steps correctly, so here are my additions to @kbrajwani's comments.

  1. Follow his description.
  2. From Python you MUST execute the nltk.download() commands, which download the necessary data. This is something I totally missed.
  3. Install Java on the machine. The library uses Apache Tika, which is written in Java and extracts the content of a PDF file very well, so parsing will be more efficient.
  4. Try to use Python 3.8; I had some issues with 3.9 and 3.10.
  5. Try to use Linux. On Windows, I had compilation issues.
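Two of these points can be checked before importing the library at all. The following is a rough sketch, assuming that a java executable on PATH is enough for Tika to start; the preflight function is a hypothetical helper, not part of resume_parser.

```python
import shutil
import sys


def preflight(python_version=None, which=shutil.which):
    """Return a list of notes; an empty list means nothing looked suspicious."""
    major, minor = tuple((python_version or sys.version_info)[:2])
    notes = []
    if (major, minor) != (3, 8):
        notes.append(f"Python {major}.{minor} in use; 3.8 was reported to work")
    if which("java") is None:
        notes.append("no 'java' executable found on PATH; Apache Tika needs a Java runtime")
    return notes


if __name__ == "__main__":
    for note in preflight():
        print("note:", note)
```

This only reports possible causes; it does not download NLTK data or install anything.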

sz332 avatar Dec 09 '21 21:12 sz332

Thanks, @sz332, for sharing your experience.

kbrajwani avatar Dec 11 '21 11:12 kbrajwani