stackexchange-dump-to-postgres

Error when running python load_into_pg.py -d stackoverflow -s pgload

Open flfilip opened this issue 4 years ago • 3 comments

Hi Team,

Could you please help me with the below error when trying to run python load_into_pg.py? Am I missing something?

Traceback (most recent call last):
  File "load_into_pg.py", line 413, in <module>
    import libarchive
  File "/home/pgadminuser/.local/lib/python2.7/site-packages/libarchive/__init__.py", line 1, in <module>
    from .entry import ArchiveEntry
  File "/home/pgadminuser/.local/lib/python2.7/site-packages/libarchive/entry.py", line 6, in <module>
    from . import ffi
  File "/home/pgadminuser/.local/lib/python2.7/site-packages/libarchive/ffi.py", line 108, in <module>
    errno = ffi('errno', [c_archive_p], c_int)
  File "/home/pgadminuser/.local/lib/python2.7/site-packages/libarchive/ffi.py", line 95, in ffi
    f = getattr(libarchive, 'archive_'+name)
  File "/usr/lib/python2.7/ctypes/__init__.py", line 379, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib/python2.7/ctypes/__init__.py", line 384, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))

When I run the command below, everything seems OK.

pip install -r requirements.txt
Collecting argparse==1.2.1 (from -r requirements.txt (line 1))
Collecting distribute==0.6.24 (from -r requirements.txt (line 2))
Collecting libarchive-c==2.9 (from -r requirements.txt (line 3))
  Using cached https://files.pythonhosted.org/packages/23/16/622ae829e9c1795479df865bbcbb4e7e3990f3e451e440f00bf1615be7fc/libarchive_c-2.9-py2.py3-none-any.whl
Collecting lxml==4.5.2 (from -r requirements.txt (line 4))
  Using cached https://files.pythonhosted.org/packages/d1/2d/642ef7013aa56af52e14b5b7d53c5d591e6d038c9688e06d0f2a20ed26b2/lxml-4.5.2-cp27-cp27mu-manylinux1_x86_64.whl
Collecting psycopg2-binary==2.8.4 (from -r requirements.txt (line 5))
  Using cached https://files.pythonhosted.org/packages/97/2a/b854019bcb9b925cd10ff245dbc9448a82fe7fdb40127e5cf1733ad0765c/psycopg2_binary-2.8.4-cp27-cp27mu-manylinux1_x86_64.whl
Collecting six==1.10.0 (from -r requirements.txt (line 6))
  Using cached https://files.pythonhosted.org/packages/c8/0a/b6723e1bc4c516cb687841499455a8505b44607ab535be01091c0f24f079/six-1.10.0-py2.py3-none-any.whl
Installing collected packages: argparse, distribute, libarchive-c, lxml, psycopg2-binary, six
Successfully installed argparse-1.2.1 distribute-0.6.24 libarchive-c-2.9 lxml-4.6.2 psycopg2-binary-2.8.4 six-1.10.0

Thank you, Florin

flfilip avatar Apr 07 '21 01:04 flfilip

It looks like a libarchive installation issue: it probably could not find archive_7z for loading. The last line of the traceback, which is missing from the output above, should contain the exact error.

Are you trying to run this on Windows or OSX?

If so, getting it to run inside Docker may be easier than trying to install the dependencies directly.

musically-ut avatar Apr 07 '21 07:04 musically-ut

Hi,

Thank you for your message. No, I am running it on an Ubuntu machine. I tried on multiple VMs and got the same error. Could you please provide the steps to run it inside a container?

Florin

flfilip avatar Apr 07 '21 08:04 flfilip

Curious.

On your Ubuntu machine, can you try:

sudo apt-get install libarchive-dev

and then try running the script again? If that fails, it may be necessary to point the LD_LIBRARY_PATH environment variable at the directory containing the libarchive*.so files while running the script.
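To check whether Python can see the shared library at all, a quick diagnostic along these lines may help. It is only a sketch: the helper name check_libarchive is hypothetical, and it assumes the library is discoverable under the name "archive" through ctypes.util.find_library, which is also how I would expect the Python bindings to look it up. The symbol archive_errno is taken from the traceback above.

```python
from __future__ import print_function

import ctypes
import ctypes.util


def check_libarchive(symbol="archive_errno"):
    """Return (path, has_symbol) for the libarchive shared library.

    path is None when the dynamic loader cannot locate the library.
    has_symbol is True only if the library loads and exports `symbol`.
    """
    # Ask the platform's loader machinery for a library named "archive".
    path = ctypes.util.find_library("archive")
    if path is None:
        return None, False
    try:
        lib = ctypes.cdll.LoadLibrary(path)
    except OSError:
        # The library was located but could not be loaded.
        return path, False
    # ctypes raises AttributeError for a missing symbol; hasattr turns
    # that into False instead of a traceback.
    return path, hasattr(lib, symbol)


if __name__ == "__main__":
    path, ok = check_libarchive()
    print("libarchive found at:", path)
    print("archive_errno available:", ok)
```

If the first line prints None, the library is not installed or not on the loader's search path. If the library is found but the symbol is not available, the installed libarchive may be too old or built differently from what the Python package expects.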

I personally do not run it in a container because I'm able to install all the dependencies.

musically-ut avatar Apr 10 '21 15:04 musically-ut