
Package not Self-Contained

Open superctj opened this issue 1 year ago • 9 comments

Thank you for open-sourcing this handy tool! I was trying to install the package from pip and source, but neither works out-of-the-box. From my end (Ubuntu with Python 3.10), running the command metacrafter scan-file --format short <file name> gives me the error:

Traceback (most recent call last):
  File "~/miniconda3/envs/metacrafter/bin/metacrafter", line 33, in <module>
    sys.exit(load_entry_point('metacrafter==0.0.4', 'console_scripts', 'metacrafter')())
  File "~/miniconda3/envs/metacrafter/lib/python3.10/site-packages/metacrafter-0.0.4-py3.10.egg/metacrafter/__main__.py", line 10, in main
    from .core import cli
  File "~/miniconda3/envs/metacrafter/lib/python3.10/site-packages/metacrafter-0.0.4-py3.10.egg/metacrafter/core.py", line 18, in <module>
    from iterable.helpers.detect import open_iterable
ModuleNotFoundError: No module named 'iterable'

Even after I installed iterabledata 1.0.5 from pip, I ran into another error: AttributeError: module 'snappy' has no attribute 'decompress'. Could you please look into the issue? Thanks in advance.

superctj avatar Jul 01 '24 03:07 superctj

@superctj Hi! Looks like I declared some of the dependencies incorrectly in the package. I will fix it ASAP, thanks!

I think you need to install python-snappy with pip install python-snappy. More info here: https://stackoverflow.com/questions/48535799/module-snappy-has-no-attribute-decompress
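For anyone hitting the same AttributeError, a quick way to see whether the snappy module that Python actually imports provides decompress is sketched below. This is only an illustrative check, not part of metacrafter or iterabledata; the helper name module_exposes is made up for this example. It assumes the cause is the one described in the linked Stack Overflow question, namely that the importable snappy module is not the one shipped by python-snappy.

```python
import importlib


def module_exposes(module_name: str, attribute: str) -> bool:
    """Return True if `module_name` can be imported and has `attribute`.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed at all (or fails to import).
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    if module_exposes("snappy", "decompress"):
        print("snappy.decompress is available")
    else:
        print("snappy is missing or lacks decompress; try: pip install python-snappy")
```

If the check fails even after installing python-snappy, it may be worth looking at which file the snappy module is loaded from (snappy.__file__) to see whether another package is shadowing it.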

ivbeg avatar Jul 01 '24 06:07 ivbeg

Fixed in the main branch; it will be included in the next package release.

ivbeg avatar Jul 01 '24 12:07 ivbeg

Thank you @ivbeg for the quick action! I appreciate it.

superctj avatar Jul 01 '24 16:07 superctj

Hi @ivbeg again, FYI, when I installed the package from the main branch, I ran into ModuleNotFoundError: No module named 'Cython'. After I installed Cython, the installation completed but when running the file scan command, the AttributeError: module 'snappy' has no attribute 'decompress' popped up again. I did pip install python-snappy and it fixed the error. However, I got a parquet.ParquetFormatException: Unsupported encoding: RLE_DICTIONARY when scanning a parquet file. Do you have any idea?

superctj avatar Jul 01 '24 17:07 superctj

@superctj not yet, it's ok with almost all parquet files that I tested. Could you share this file please?

ivbeg avatar Jul 01 '24 18:07 ivbeg

Thank you for your quick response! GitHub does not support attaching parquet files so I put the sample file in Google Drive. Let me know if you cannot access the file.

superctj avatar Jul 01 '24 18:07 superctj

@superctj Thanks. I use the pure Python parquet lib https://pypi.org/project/parquet/ to read parquet files since it provides simple iteration functions, but it looks like it doesn't support this type of encoding. I will take a look a bit later to see if I can easily replace it with the pyarrow parquet reader.

ivbeg avatar Jul 01 '24 19:07 ivbeg

@superctj Finally fixed: I replaced the parquet lib with pyarrow. The changes are in the iterabledata library, so you need to reinstall it from the main branch of the source code repository: https://github.com/apicrafter/pyiterable
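As a rough illustration of what row-by-row reading with pyarrow can look like (this is a sketch under assumptions, not the actual change made in iterabledata), the function below streams a parquet file in batches and yields each row as a dict. The name iter_parquet_rows and the batch_size default are hypothetical; ParquetFile.iter_batches and RecordBatch.to_pylist are pyarrow calls.

```python
from typing import Any, Dict, Iterator


def iter_parquet_rows(path: str, batch_size: int = 1024) -> Iterator[Dict[str, Any]]:
    """Yield rows of a parquet file one at a time as dicts.

    Hypothetical helper for illustration; not the iterabledata implementation.
    """
    # Imported lazily so this module still loads when pyarrow is not installed.
    import pyarrow.parquet as pq

    parquet_file = pq.ParquetFile(path)
    # Read the file in record batches instead of loading it all into memory.
    for batch in parquet_file.iter_batches(batch_size=batch_size):
        # to_pylist() converts a record batch into a list of {column: value} dicts.
        for row in batch.to_pylist():
            yield row
```

Usage would be along the lines of: for row in iter_parquet_rows("sample.parquet"): print(row), where sample.parquet is a placeholder file name.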

ivbeg avatar Jul 04 '24 13:07 ivbeg

Thank you @ivbeg for the quick action! I will probably give it a shot later.

superctj avatar Jul 05 '24 18:07 superctj