multielo

Performance issues

Open antl3x opened this issue 2 years ago • 3 comments

Hey @djcunningham0. First, congrats on the amazing repo / project!

I'm opening this issue to discuss possible performance improvements to the process_data algorithm.

I ran it on a dataset of ~70k rows (matches), and it took more than 140 minutes to finish.

Maybe we can make some changes to speed things up? Maybe use Numba?
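
To make the idea concrete, here is a rough sketch of the kind of array-based loop I have in mind. It is not multielo's actual process_data code: it assumes a simplified two-player Elo update, and the function name `run_elo` and its parameters are made up for illustration. The point is only that the ratings live in a NumPy array indexed by integer player IDs, so nothing touches a DataFrame inside the loop. A loop of this shape is also the kind of code Numba's `njit` is designed to compile, though I haven't tested that here.

```python
import numpy as np


def run_elo(winners, losers, n_players, k=32.0, initial=1000.0):
    """Hypothetical two-player Elo pass over pre-encoded matches.

    winners, losers: integer arrays of player IDs in [0, n_players).
    Returns the final rating of every player as a float array.
    """
    ratings = np.full(n_players, initial, dtype=np.float64)
    for i in range(winners.shape[0]):
        w = winners[i]
        l = losers[i]
        # Standard Elo expected score for the winner.
        expected_w = 1.0 / (1.0 + 10.0 ** ((ratings[l] - ratings[w]) / 400.0))
        # The winner gains exactly what the loser gives up.
        delta = k * (1.0 - expected_w)
        ratings[w] += delta
        ratings[l] -= delta
    return ratings


# Tiny example: player 0 beats player 1, then player 2 beats player 0.
winners = np.array([0, 2])
losers = np.array([1, 0])
final_ratings = run_elo(winners, losers, n_players=3)
```

Player names would need to be mapped to integer IDs first (for example with `pandas.factorize`), and the real multiplayer scoring would replace the two-player update inside the loop.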

Some references for ideas:

https://python.plainenglish.io/a-solution-to-boost-python-speed-1000x-times-c9e7d5be2f40

https://towardsdatascience.com/how-to-make-your-pandas-operation-100x-faster-81ebcd09265c

antl3x, Jun 09 '22 21:06