ydata-profiling
Feature Request: Support for the Modin framework to make EDA on larger datasets much faster
Missing functionality
Support for the Modin framework to make EDA on larger datasets much faster
I would like to request the addition of support for the Modin framework to our EDA tools. As datasets become larger and more complex, performing EDA using traditional tools like pandas can become challenging due to limitations in memory and processing power. Modin is designed to provide efficient and scalable data processing capabilities for large datasets by using distributed computing techniques to perform operations in parallel. This results in a significant reduction in computation time, enabling data analysts to analyze and visualize large datasets more quickly and efficiently, leading to faster insights and decision-making.
Additionally, Modin offers a seamless interface built on top of pandas, which allows users to leverage the full power of distributed computing without needing to learn new syntax or concepts. This makes it an accessible and user-friendly solution for data analysts and scientists who want to work with large datasets without needing to learn new tools or techniques. With Modin, users can simply install the framework and begin using it immediately with their existing pandas code.
By incorporating Modin into our EDA tools, we could significantly speed up data analysis on large datasets, leading to faster insights and decision-making. Therefore, we request the addition of support for the Modin framework in our EDA tools to help us handle large datasets more efficiently.
Proposed feature
Basically, the user should not need to know how the internals work. They simply use pandas-profiling as usual, and if the data size is large, the backend uses modin.pandas instead of pandas.
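To make the proposal concrete, here is a minimal sketch of the kind of size-based backend selection described above. It is not part of ydata-profiling: the function names `choose_backend_name` and `load_backend` and the `DEFAULT_THRESHOLD_BYTES` value are hypothetical, chosen only for illustration. The sketch falls back to pandas when Modin is not installed.

```python
import importlib
import importlib.util

# Hypothetical cut-off (about 1 GB); not an actual ydata-profiling setting.
DEFAULT_THRESHOLD_BYTES = 1_000_000_000


def choose_backend_name(data_size_bytes, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Return the name of the dataframe module to use for a dataset of this size.

    Picks "modin.pandas" only when the data is at least `threshold_bytes`
    and the `modin` package can be found; otherwise picks "pandas".
    """
    modin_available = importlib.util.find_spec("modin") is not None
    if data_size_bytes >= threshold_bytes and modin_available:
        return "modin.pandas"
    return "pandas"


def load_backend(data_size_bytes, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Import and return the chosen module (the chosen package must be installed)."""
    return importlib.import_module(choose_backend_name(data_size_bytes, threshold_bytes))
```

A caller could then write `pd = load_backend(file_size)` and keep using the familiar pandas-style API, which mirrors Modin's documented drop-in usage, `import modin.pandas as pd`. Whether the profiling code itself would accept the resulting Modin DataFrame is a separate question that this sketch does not address.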
Alternatives considered
No response
Additional context
Modin
Hi @danishbansal808,
thank you for the detailed request! At the moment we don't have Modin on our roadmap. There have not yet been many requests from the community for support of the framework.
If more users are interested in this and the feature is upvoted, we will consider it for the roadmap.
Hey @fabclmnt, do we have any plans in the future roadmap to make the ydata-profiling library compatible with the Modin framework, so we can leverage the full power of distributed computing for profiling huge datasets?