[Question]: Installing `pyab3p`

Open jessicapetrochuk opened this issue 1 year ago • 2 comments

Question

When I run `species_linker = EntityMentionLinker.load("species-linker")`, I get this warning: "'pyab3p' is not found, switching to a model without abbreviation resolution. This might impact the model performance. To reach full performance, please install pyab3p by running: pip install pyab3p."

When I run `pip install pyab3p`, the build fails with `ERROR: Failed building wheel for pyab3p`, caused by `ab3p_source/Ab3P.cpp:1:10: fatal error: 'Ab3P.h' file not found`. Is this a known issue, and is there a known fix?

jessicapetrochuk avatar Aug 27 '24 12:08 jessicapetrochuk

Hello @jessicapetrochuk, `pyab3p` does not work on all operating systems, which is why we included a version of the model that does not require `pyab3p`.
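For readers who want to know up front which variant they will get, here is a minimal sketch that checks whether `pyab3p` is importable before loading the linker. It uses only the standard library; the helper name `has_module` is made up for illustration and is not part of Flair.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    # find_spec looks the module up without importing it
    return importlib.util.find_spec(name) is not None


if has_module("pyab3p"):
    print("pyab3p found: abbreviation resolution should be available")
else:
    print("pyab3p not found: expect the model without abbreviation resolution")
```

This only tells you whether the package is installed, not whether it imports cleanly on your platform.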

alanakbik avatar Aug 27 '24 13:08 alanakbik

@alanakbik thanks - how does this impact the performance of the model?

jessicapetrochuk avatar Aug 27 '24 14:08 jessicapetrochuk

I think @sg-wbi or @mariosaenger can probably answer this question?

alanakbik avatar Aug 28 '24 16:08 alanakbik

I would say that the impact of abbreviation resolution depends on the entity type. For species specifically it should be negligible, while for diseases, for instance, resolving abbreviations can be critical (think of all the acronyms like "TBC").
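To illustrate why this matters, here is a toy sketch of the idea: short forms are swapped for their long forms before a mention is looked up. This is not how Ab3P or Flair work internally; the function name and the abbreviation table (including the long form chosen for "TBC") are assumptions made up for the example.

```python
import re


def expand_abbreviations(text: str, abbreviations: dict) -> str:
    """Replace each whole-word short form in `text` with its long form."""
    for short, long_form in abbreviations.items():
        pattern = r"\b" + re.escape(short) + r"\b"
        # A function replacement avoids backslash handling in the long form
        text = re.sub(pattern, lambda match: long_form, text)
    return text


# Hypothetical table; a real resolver would detect these pairs in the document
abbreviations = {"TBC": "tuberculosis"}
print(expand_abbreviations("Patients with TBC were excluded.", abbreviations))
```

Without the expansion step, a linker only sees the bare acronym, which is much harder to match against a disease dictionary.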

If you want to read more about the topic I can recommend these two publications:

  • https://aclanthology.org/2023.emnlp-main.893/
  • https://academic.oup.com/bioinformatics/article/40/8/btae474/7721929

sg-wbi avatar Aug 28 '24 18:08 sg-wbi

Hi @jessicapetrochuk

Can you please post the full stack trace, error message, and logs? Also, can you please specify which Python version and OS you are using?
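A small sketch for collecting those details in one go, using only the standard library. The function name `environment_report` is made up for illustration.

```python
import platform
import sys


def environment_report() -> dict:
    """Collect the interpreter and OS details usually asked for in bug reports."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "executable": sys.executable,
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Run it inside the same virtual environment you install into, so the reported interpreter matches the one pip is building for.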

helpmefindaname avatar Aug 29 '24 08:08 helpmefindaname

I'm on python 3.11.7 and macOS

Here is the full stack trace:

Collecting pyab3p
  Using cached pyab3p-0.1.0.tar.gz (35 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: pyab3p
  Building wheel for pyab3p (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for pyab3p (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [35 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-13-x86_64-cpython-311
      creating build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/__init__.py -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_Lf1chSf.nm -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_Lf1chSf.ha -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/cshset_wrdset3.ct -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_stop.str -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/Lf1chSf -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/Ab3P_prec.dat -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/cshset_wrdset3.str -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/cshset_wrdset3.ad -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_stop.nm -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_stop.ha -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/stop -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_Lf1chSf.ad -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/cshset_wrdset3.ha -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_stop.ad -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/cshset_wrdset3.nm -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/hshset_Lf1chSf.str -> build/lib.macosx-13-x86_64-cpython-311/word_data
      copying word_data/SingTermFreq.dat -> build/lib.macosx-13-x86_64-cpython-311/word_data
      running build_ext
      clang++ -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk "-I/Users/jessicapetrochuk/Library/Mobile Documents/com~apple~CloudDocs/Documents/Replifai/replifai/venv/include" -I/usr/local/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11 -c flagcheck.cpp -o flagcheck.o -std=c++17
      building 'pyab3p' extension
      creating build/temp.macosx-13-x86_64-cpython-311
      creating build/temp.macosx-13-x86_64-cpython-311/ab3p_source
      clang++ -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk -I/private/var/folders/4j/z95plqpx4m73mjzln4_yjs_00000gn/T/pip-build-env-j5cp525f/overlay/lib/python3.11/site-packages/pybind11/include "-I/Users/jessicapetrochuk/Library/Mobile Documents/com~apple~CloudDocs/Documents/Replifai/replifai/venv/include" -I/usr/local/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11 -c ab3p_source/Ab3P.cpp -o build/temp.macosx-13-x86_64-cpython-311/ab3p_source/Ab3P.o -std=c++17 -mmacosx-version-min=10.14 -fvisibility=hidden -g0 -w
      ab3p_source/Ab3P.cpp:1:10: fatal error: 'Ab3P.h' file not found
      #include "Ab3P.h"
               ^~~~~~~~
      1 error generated.
      error: command '/usr/bin/clang++' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pyab3p
Failed to build pyab3p
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (pyab3p)

jessicapetrochuk avatar Aug 29 '24 12:08 jessicapetrochuk

Okay, when I built the Python binding, I didn't manage to build for macOS as a target, hence pip isn't downloading a precompiled version. The missing "Ab3P.h" file is an error on my side, though. I have just released a new version that contains the missing header files. You can try again by installing pyab3p==0.1.1; however, from my experience I would guess that importing pyab3p will cause the Python script to crash on macOS.
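If you are worried about the import crashing your script, one cautious option is to try the import in a child interpreter first, so a hard crash there cannot take down the main process. This is a minimal sketch using only the standard library; the helper name `import_survives` is made up for illustration.

```python
import subprocess
import sys


def import_survives(module_name: str) -> bool:
    """Import `module_name` in a child interpreter; True if it exits cleanly."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A missing module or a crash in native code both give a non-zero exit code
    return result.returncode == 0


if import_survives("pyab3p"):
    print("pyab3p imports cleanly")
else:
    print("pyab3p is missing or fails on import")
```

The check costs one extra interpreter start, which is usually acceptable for a one-off probe before loading a model.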

helpmefindaname avatar Aug 29 '24 19:08 helpmefindaname

The pip install and the import both actually work for me after that update - thanks!

jessicapetrochuk avatar Aug 29 '24 23:08 jessicapetrochuk

Wow, that is great. I suppose that means I can close this issue. Feel free to reopen it if any follow-up is required.

helpmefindaname avatar Aug 30 '24 09:08 helpmefindaname