support big data use cases
Great library! It would be great if there were support for big data use cases (integration with Dask/Vaex/Spark). My use case involves a dataset that is too large to fit in memory and has a large class imbalance, so if I want to keep the original target ratio, I need to work with the original data size rather than downsample the data.
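As context for the request above: the key constraint is computing statistics (such as the target class ratio) without materializing the full dataset in memory. A minimal sketch of that streaming pattern, using only the standard library — the `target` column name and the tiny in-memory sample are hypothetical stand-ins, not part of any library's API:

```python
import csv
import io
from collections import Counter

def class_ratios(lines, target_col="target"):
    """Stream rows one at a time and tally the target column,
    so the full dataset is never loaded into memory."""
    reader = csv.DictReader(lines)
    counts = Counter(row[target_col] for row in reader)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Tiny in-memory stand-in for a file too large to load at once.
sample = io.StringIO("target,x\n1,0.3\n0,0.1\n0,0.5\n0,0.2\n")
print(class_ratios(sample))  # {'1': 0.25, '0': 0.75}
```

The same single-pass idea is what Dask/Vaex/Spark generalize: aggregations are pushed down to chunks so only per-chunk state lives in memory.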
Hello @yair4Data, thank you for the kind words! I hope the library can be useful to you!
Are you saying you are running out of memory when converting to a pandas DataFrame (e.g. `df = df.compute()` in Dask)?
Or are you getting an error message when running the report or generating the HTML?
I have the same problem; my data has about a billion rows, and it does not work. Can the modin package be used?
@haiyuni I haven't looked at modin, I will do so and get back here.
Regarding the billion-row issue, I am assuming you are referring to the scaling issue (#73)? Or is there a specific error I should be looking at?
Thanks again!