I'm still having this same issue:
https://github.com/kieferk/dfply/issues/8
I am using conda to install dfply, which I have to do because conda is the package manager on the computing cluster I have access to.
conda install -c tallic dfply
That's the command I use to install the package from https://anaconda.org/tallic/dfply.
But when I go to use dfply, it still says the diamonds.csv data is missing.
Traceback (most recent call last):
File "ACH_nested_anova.py", line 1, in <module>
import dfply
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/__init__.py", line 11, in <module>
from .data import diamonds
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/__init__.py", line 5, in <module>
diamonds = pd.read_csv(os.path.join(root, "diamonds.csv"))
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv' does not exist: b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv'
2019-03-15 13:25:11 ⌚ gateway-03 in ~/ACH_Development/ACH_tests/ACH_quiz3/python_scripts/Analysis
○ → python ACH_nested_anova.py
Traceback (most recent call last):
File "ACH_nested_anova.py", line 2, in <module>
from dfply import group_by as group_by, summarize as summarize, select as select
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/__init__.py", line 11, in <module>
from .data import diamonds
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/__init__.py", line 5, in <module>
diamonds = pd.read_csv(os.path.join(root, "diamonds.csv"))
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv' does not exist: b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv'
2019-03-15 13:25:41 ⌚ gateway-03 in ~/ACH_Development/ACH_tests/ACH_quiz3/python_scripts/Analysis
○ → pip install dfply
Requirement already satisfied: dfply in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (0.3.1)
Requirement already satisfied: numpy in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from dfply) (1.16.2)
Requirement already satisfied: pandas in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from dfply) (0.24.2)
Requirement already satisfied: python-dateutil>=2.5.0 in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from pandas->dfply) (2.8.0)
Requirement already satisfied: pytz>=2011k in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from pandas->dfply) (2018.9)
Requirement already satisfied: six>=1.5 in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from python-dateutil>=2.5.0->pandas->dfply) (1.12.0)
2019-03-15 13:26:59 ⌚ gateway-03 in ~/ACH_Development/ACH_tests/ACH_quiz3/python_scripts/Analysis
○ → python ACH_nested_anova.py
Traceback (most recent call last):
File "ACH_nested_anova.py", line 2, in <module>
from dfply import group_by as group_by, summarize as summarize, select as select
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/__init__.py", line 11, in <module>
from .data import diamonds
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/__init__.py", line 5, in <module>
diamonds = pd.read_csv(os.path.join(root, "diamonds.csv"))
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv' does not exist: b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv'
I can replace the import line with any of the following and the result is still the same:
- import dfply
- from dfply import group_by as group_by, summarize as summarize, select as select
- from dfply import *
Please help. I cannot seem to use git or pip to correct the problem. pip tells me the package is already installed, yet the error persists, and git is not available to me.
I'm not 100% sure, but I suspect this issue comes from building from source. If you run pip download --no-binary :all: --no-dependencies dfply, you'll find the same problem: the diamonds.csv file is missing from the dfply/data/ folder. However, if you download the wheel with pip download --no-dependencies dfply and inspect it, you'll find that the diamonds.csv file is there.
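For anyone who wants to repeat that comparison without unpacking things by hand, here is a rough sketch that lists the members of a downloaded archive and looks for the CSV. The helper names archive_members and has_data_file are made up for this example, and it assumes the sdist is a tar-based archive and the wheel is an ordinary zip file.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the member names of a wheel/zip or a tar-based sdist."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    # Anything that is not a zip is assumed to be a tar archive (e.g. .tar.gz).
    with tarfile.open(path) as tf:
        return tf.getnames()


def has_data_file(path, name="diamonds.csv"):
    """True if some member of the archive is a file with the given name."""
    return any(member.endswith("/" + name) for member in archive_members(path))
```

Calling has_data_file on the file that each pip download command leaves in the current directory should show whether the sdist and the wheel differ in the way described above.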
I don't know anything about conda package management, but perhaps they take the result of python setup.py sdist, which would omit the data file. According to this random SO post, a MANIFEST.in file should fix things:
https://stackoverflow.com/questions/7522250/how-to-include-package-data-with-setuptools-distribute
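To make the packaging idea concrete, here is a hedged sketch of the keyword arguments a setup.py could pass to setuptools so the CSV ships with the package. This is not dfply's actual setup.py; the SETUP_KWARGS name and the values in it are assumptions, based only on the dfply/data/ layout visible in the traceback.

```python
# Hypothetical excerpt of a setup.py; these are not dfply's real values.
SETUP_KWARGS = dict(
    name="dfply",
    packages=["dfply", "dfply.data"],
    # Ask setuptools to ship every CSV found inside the dfply.data package.
    package_data={"dfply.data": ["*.csv"]},
)

# In a real setup.py this would end with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

The MANIFEST.in route from the linked post would be a line such as include dfply/data/*.csv next to setup.py, which tells the sdist command to pick up the file.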
I ran into the same missing diamonds.csv error when the dfply library was installed using conda (conda install -c tallic dfply). To resolve it, remove the library and its dependencies using conda, then install it with pip inside the same conda environment. The library installed with pip install works.
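If it helps, here is a small sketch for checking whether the data file is actually present in the active environment, before or after reinstalling. It locates the package without importing it, so it still works while import dfply is failing. The function name find_data_file is made up for this example.

```python
import importlib.util
import os


def find_data_file(package="dfply", relative=("data", "diamonds.csv")):
    """Return (expected path, whether it exists) for a file inside a package.

    find_spec on a top-level package locates it without running its
    __init__.py, so this works even when importing the package fails.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        # Package not installed, or it is a plain module rather than a package.
        return None, False
    package_dir = list(spec.submodule_search_locations)[0]
    path = os.path.join(package_dir, *relative)
    return path, os.path.isfile(path)
```

Running print(find_data_file()) with the environment's Python should print the expected location of diamonds.csv and whether the file is there.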