
[Bug]: Getting error on full data analysis

Open rparthas opened this issue 2 years ago • 1 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

I am running MindsDB in Docker and imported this dataset into it: https://www.kaggle.com/datasets/eishahassan/large-scale-manufacturing-industries-production

I added the dataset as a file import and queried it from the MindsDB web UI with `SELECT * FROM files.large_scale_industries;`

Once the results are retrieved, I click Data Insights and then Full Data Analysis. It shows "Analysis in progress" but never completes, and it does not show a distribution for two columns: Product and Unit of Quantity.

I have renamed the columns to use _ instead of spaces.

I see the following error:

2022-11-27 15:38:47,921 - ERROR - http exception: index 0 is out of bounds for axis 0 with size 0
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/conda/lib/python3.7/site-packages/flask_restx/api.py", line 403, in wrapper
    resp = resource(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/flask/views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/flask_restx/resource.py", line 49, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mindsdb/api/http/namespaces/analysis.py", line 96, in post
    data_frame=DataFrame(data, columns=column_names)
  File "/opt/conda/lib/python3.7/site-packages/mindsdb/integrations/handlers/lightwood_handler/lightwood_handler/lightwood_handler.py", line 187, in analyze_dataset
    analysis = lightwood.analyze_dataset(data_frame)
  File "/opt/conda/lib/python3.7/site-packages/lightwood/api/high_level.py", line 125, in analyze_dataset
    problem_definition = ProblemDefinition.from_dict({'target': str(df.columns[0])})
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4604, in __getitem__
    return getitem(key)
IndexError: index 0 is out of bounds for axis 0 with size 0
2022-11-27 15:38:47,944 - ERROR - [2022-11-27 15:38:47,943] ERROR in app: Exception on /api/analysis/data [POST]
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
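Reading the traceback, the failure is at `str(df.columns[0])`, which suggests the frame that reached lightwood had no columns at all. Below is a minimal sketch of the kind of guard that would turn this into a readable error. It uses a plain list of column names in place of a pandas index, and `default_target` is a hypothetical helper for illustration, not a function in mindsdb or lightwood.

```python
from typing import Sequence


def default_target(column_names: Sequence[str]) -> str:
    """Return the first column name, or raise a descriptive error if there are none.

    Hypothetical helper: mirrors the `df.columns[0]` lookup from the traceback,
    but checks for the empty case first.
    """
    if len(column_names) == 0:
        raise ValueError("dataset has no columns; nothing to analyze")
    return str(column_names[0])


# Illustrative column names based on the two columns mentioned in this report.
print(default_target(["Product", "Unit_of_Quantity"]))

# An empty column list is rejected with a clear message instead of an IndexError.
try:
    default_target([])
except ValueError as exc:
    print(f"rejected: {exc}")
```

Whether the empty frame comes from the request payload or from how the query result is passed along is not something the traceback alone shows.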

Expected Behavior

The analysis should complete without errors and show a distribution for every column. It should report 7728/7728 rows complete.

Steps To Reproduce

1. Run MindsDB from the Docker image: `docker run -p 47334:47334 -p 47335:47335 mindsdb/mindsdb`
2. Go to localhost:47334 and upload the dataset via file import
3. Give it a suitable table name and run a SELECT on the table
4. Run Full Data Analysis

Anything else?

No response

rparthas avatar Nov 27 '22 15:11 rparthas

Hi, thank you for reporting this. We will try to replicate it shortly.

Ricram2 avatar Nov 28 '22 06:11 Ricram2

@rparthas Did you pull the latest Docker image?

ZoranPandovski avatar Dec 05 '22 17:12 ZoranPandovski

I am using mindsdb/mindsdb:22.10.2.1, which was updated 2 months ago. I guess that was the latest when I started this work. I can try pulling the version that was updated 7 days ago.

rparthas avatar Dec 05 '22 17:12 rparthas

@ZoranPandovski I tried with the latest tag. No error has come up so far. I will watch it for a while and close this if the analysis completes.

rparthas avatar Dec 05 '22 18:12 rparthas

No error, but the analysis still does not complete for all 3 columns.

rparthas avatar Dec 05 '22 18:12 rparthas

Hi @rparthas, I investigated this and, from what I can gather, it works fine (see attached image). The analysis takes a couple of seconds to complete.

If you are still having this issue, feel free to reopen, but I'll close this for the time being.

Image

paxcema avatar Jan 17 '23 20:01 paxcema