pdf-struct
pip install problem with the sentencepiece dependency
I am getting an error on pip install related to sentencepiece. The first error was a missing pkg-config, which I installed with brew. But there are additional errors related to sentencepiece, possibly caused by version changes in sentencepiece; they appear to come from the sentencepiece C++ wrapper.
I was able to install the sentencepiece module on its own, but the pdf-struct install breaks when it tries to install sentencepiece. It looks like I may also have already had it installed for the sentence-transformers module.
I used a devcontainer instance to isolate everything, and the problem I'm experiencing now is with torch 1.9 under the latest Python version.
I see PyTorch has a >2 release. Is anyone considering an upgrade of the pdf-struct dependencies and willing to work together on it?
Getting closer... Hopefully.
I have it isolated to a pickle/joblib issue when running predict. (The pdf_struct module installs without issue on Python 3.8.16 if I also bring Rust into the container for building from source.)
I have installed Python 3.8.5 using pyenv in a container, and I'm having the same issue with the model and pickle as under 3.8.16. I have likely reached the end of my attempts, unless I dig deeper into how to retrain and rebuild the model.
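For anyone comparing environments, here is a minimal sketch for recording which versions are actually installed before running predict. I have not confirmed that a version mismatch is the cause of the pickle failure; that is only an assumption, since pickled models are often sensitive to the versions of the libraries that wrote them. The helper name `installed_versions` and the package list are my own choices for illustration, not part of pdf-struct.

```python
import sys
from importlib import metadata  # standard library since Python 3.8


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical list of packages worth comparing; adjust as needed.
    to_check = ["pdf-struct", "torch", "sentencepiece", "joblib", "scikit-learn"]
    print("python", sys.version.split()[0])
    for name, version in installed_versions(to_check).items():
        print(name, version or "not installed")
```

Running this in both a working and a failing container and diffing the output should at least show whether the two environments differ.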
If someone would like to reproduce the problem, I have released a devcontainer that exhibits the issue here: https://github.com/petegordon/pdf_struct_example