ydata-profiling
Mixed column stops at "Get scatter matrix"
I have a dataset of about 250K rows and 28 columns. When I run the profiler, it stops at 83%, at "Get scatter matrix" (see the attached screenshot).
To Reproduce
Version information:
Additional context
It's still computing (see *); it is just slow for these 28x28 = 784 plots. You can either turn this off or limit it to 28xn, where n is the number of target variables. See this page.
Thanks. I realized that the problem was a column containing a mix of letters and numbers. I dropped the column and it worked OK.
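For anyone hitting the same thing, here is a rough sketch of how such a column could be spotted before profiling. The helper name `find_mixed_columns` is made up for illustration and is not part of pandas or the profiler. It only assumes an object with `.items()` yielding (column name, values) pairs, which is true of both a plain dict and a pandas DataFrame.

```python
def _is_number(value):
    """Return True if the value can be read as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def find_mixed_columns(table):
    """List columns that hold both numeric-looking and non-numeric values.

    `table` is anything with .items() yielding (name, values) pairs,
    e.g. a dict of lists or a pandas DataFrame. Hypothetical helper.
    """
    mixed = []
    for name, values in table.items():
        # Collect which kinds of values occur, ignoring missing entries.
        kinds = {_is_number(v) for v in values if v is not None}
        if kinds == {True, False}:
            mixed.append(name)
    return mixed


sample = {
    "amount": [1, 2.5, 3],          # numeric only
    "city": ["Oslo", "Lima"],       # text only
    "code": [101, "A12", "7"],      # letters and numbers mixed
}
print(find_mixed_columns(sample))
```

Columns reported this way can then be dropped or cleaned before running the profiler. Note that strings such as "nan" or "inf" also parse as floats, so treat the result as a hint rather than a verdict.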
I have the same problem. My dataset has 5k records and 31 columns. When profiling starts, the process gets stuck at "Summarize dataset: 81% / Get scatter matrix". There are no mixed-type columns in the dataset; I changed all column types to float via the astype(float) function of pandas. We want to use Pandas Profiling in our product, but issues like this cause trouble at the PoC stage.
@enesMesut This will be improved in the next version. For now, either turn the scatterplots off (interactions={'continuous': False}) or select the particular columns for which you're interested in obtaining them (interactions={'targets': ['col1', 'col2']}).
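A minimal sketch of how the two workarounds might be passed to the report. The `interactions` settings are the ones quoted above; the helper names `interaction_options` and `make_report` are invented for this example, and the import path is an assumption (current releases ship as `ydata_profiling`, older ones as `pandas_profiling`).

```python
def interaction_options(targets=None):
    """Build keyword arguments for the two workarounds (hypothetical helper).

    With no targets, scatterplots are switched off entirely; with targets,
    they are limited to the listed columns.
    """
    if targets:
        return {"interactions": {"targets": list(targets)}}
    return {"interactions": {"continuous": False}}


def make_report(df, targets=None):
    """Create a profile report with the reduced interaction settings."""
    # Imported lazily so the option helper above works without the package.
    # Assumption: on older installs, import from `pandas_profiling` instead.
    from ydata_profiling import ProfileReport

    return ProfileReport(df, **interaction_options(targets))


print(interaction_options())
print(interaction_options(["col1", "col2"]))
```

Calling `make_report(df)` would skip the scatter matrix, while `make_report(df, targets=["col1", "col2"])` would restrict it to those two columns, where `col1` and `col2` are placeholders for real column names.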
@sbrugman Thanks for the fix. It worked. Please let us know when you find a complete solution to the problem. One more thing: dates are not currently supported. It would be great if they could also be summarized in the profiling.