Bare-metal installation instructions need to be updated

Open slbayer opened this issue 3 years ago • 0 comments

Summary: The bare-metal installation instructions are out of date and incomplete.

Additional context: The bare-metal installation instructions on the page docs/installation.md need to be updated. The issues I note are:

  • The documentation implies that PyPDF2 requires Python 2.7, but that appears not to be the case; Python 3.7 - 3.9, at least, all run PyPDF2 without problems.
  • The header detector uses a model which requires scikit-learn, but that package isn't on the list of things to install in the bare metal instructions. The version of scikit-learn that's required is 0.21.3; that's the version that the model was pickled with, and the current version no longer has the same symbols.
  • The TableDetection2Script.py script requires tabula, but that package isn't on the list of things to install in the bare metal instructions.
  • There's a copy-paste error in the macOS version of the instructions: the parenthetical (pdfminer and camelot) appears under both the Python 2 and the Python 3 dependency instructions, and it can't belong to both. In fact, there are no longer any Python 2 dependencies, as far as I can tell.
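
To make the gaps above easy to spot on a given machine, here is a minimal sketch of a dependency check. It is not part of the project; the function names are hypothetical, and the import names (`PyPDF2`, `sklearn`, `tabula`, `pdfminer`, `camelot`) are my assumption about what the scripts import, based on the packages named in this issue.

```python
import importlib.util

# Import names assumed from the packages mentioned in this issue;
# adjust them if the scripts import something different.
REQUIRED_MODULES = ["PyPDF2", "sklearn", "tabula", "pdfminer", "camelot"]

# Version the header-detector model was pickled with, per this issue.
PINNED_SKLEARN = "0.21.3"


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def sklearn_pin_ok(installed, required=PINNED_SKLEARN):
    """Return True only when the installed version equals the pinned one."""
    return installed.strip() == required


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) or "none")
    if "sklearn" not in missing:
        import sklearn  # imported lazily so the check still runs without it

        print("scikit-learn pin ok:", sklearn_pin_ok(sklearn.__version__))
```

Run inside the environment you intend to use for Parsr, it only reports what is absent or mismatched; it installs nothing.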

In fact, because there are no Python 2 dependencies, it's now possible to install into a virtual environment and run from there, at least on the Mac.
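
For reference, the virtual-environment route could look roughly like the sketch below. It only prints the commands (a dry run) rather than executing them. The helper name `venv_commands`, the directory name `parsr-env`, and the POSIX `bin/python` layout are assumptions of mine; the only package pinned by default is scikit-learn 0.21.3, and the other dependencies would be passed in by whoever runs it.

```python
import shlex
import sys
from pathlib import Path


def venv_commands(env_dir, packages=("scikit-learn==0.21.3",), python=sys.executable):
    """Build the commands that create a venv and install packages into it.

    Assumes a POSIX layout (env_dir/bin/python), as on macOS or Linux.
    """
    env = Path(env_dir)
    env_python = env / "bin" / "python"
    return [
        [python, "-m", "venv", str(env)],
        [str(env_python), "-m", "pip", "install", *packages],
    ]


if __name__ == "__main__":
    # Dry run: show the commands instead of executing them.
    for cmd in venv_commands("parsr-env"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

To actually execute the commands, each list can be passed to `subprocess.run(cmd, check=True)`.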

Also, FYI, I used MacPorts instead of Homebrew and built my Python 3 virtual environment using a Python 3 download from python.org rather than a MacPorts Python, and everything seems to work just fine, which was a very pleasant surprise.

slbayer, Feb 23 '22 18:02