TPOTClassifier error for large data
I am getting the following error:
RuntimeError: There was an error in the TPOT optimization process. This could be because the data was not formatted properly, or because data for a regression problem was provided to the TPOTClassifier object. Please make sure you passed the data to TPOT correctly.
My current best internal CV score is -inf, even though the optimisation progress bar shows 75%.
It works for smaller datasets, but I get the error for datasets with 200000 rows and 20 columns. I am currently using TPOT version 12.0. Is there any specific reason I am getting this?
Can you please help me resolve this error? Thank you.
I would recommend trying out TPOT2, the next version of TPOT. You can find it here: https://github.com/EpistasisLab/tpot2
This version is more stable with larger datasets than TPOT1. There is also a memory_limit parameter that you can use to set the maximum amount of RAM a single pipeline can take up.
For TPOT1: Perhaps it is simply running out of RAM and crashing?
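To get a feel for whether RAM is a plausible culprit, here is a rough back-of-the-envelope sketch. It assumes dense float64 data, and it assumes (I have not verified this for your setup) that each parallel worker ends up holding its own copy of the data. The helper names are made up for illustration. The degree-2 expansion is only there to show how quickly a feature-expanding step can grow the matrix.

```python
from math import comb


def dense_array_bytes(n_rows, n_cols, bytes_per_value=8):
    """Size in bytes of a dense array (8 bytes per value for float64)."""
    return n_rows * n_cols * bytes_per_value


def degree2_feature_count(n_cols):
    """Number of output columns of a full degree-2 polynomial expansion
    without the bias column: linear terms, squares and pairwise products."""
    return comb(n_cols + 2, 2) - 1


n_rows, n_cols, n_jobs = 200_000, 20, 8  # n_jobs is an assumed value

raw = dense_array_bytes(n_rows, n_cols)
expanded = dense_array_bytes(n_rows, degree2_feature_count(n_cols))

print(f"raw data:             {raw / 1e6:.0f} MB")
print(f"degree-2 expansion:   {expanded / 1e6:.0f} MB")
print(f"expansion x {n_jobs} workers: {expanded * n_jobs / 1e9:.2f} GB")
```

The raw data itself is small, so if memory is the problem it is more likely coming from intermediate steps and parallel workers than from the dataset alone.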
Some suggestions:
You could try to reduce RAM usage by lowering n_jobs.
You could try editing the configuration dictionary to use smaller/faster models.
One possibility is that fitting the pipeline is taking too long and timing out. You can increase the timeout by setting the parameter max_eval_time_mins.
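Putting those suggestions together for TPOT1, something like the sketch below might be a starting point. The specific values (n_jobs=1, a 15 minute timeout, 5 generations, population of 20) are guesses to tune for your machine, and the two helper functions are hypothetical names, not part of TPOT. "TPOT light" is the built-in configuration restricted to simpler, faster operators.

```python
def low_memory_tpot_kwargs(n_jobs=1, max_eval_time_mins=15):
    """Keyword arguments for tpot.TPOTClassifier (TPOT1) aimed at
    lower RAM use and fewer pipeline timeouts. Values are illustrative."""
    return {
        "generations": 5,
        "population_size": 20,
        "n_jobs": n_jobs,                          # fewer parallel workers
        "config_dict": "TPOT light",               # smaller/faster models
        "max_eval_time_mins": max_eval_time_mins,  # longer per-pipeline timeout
        "verbosity": 2,
        "random_state": 42,
    }


def fit_tpot(X, y, **overrides):
    """Fit a TPOTClassifier with the settings above; overrides win."""
    # Imported here so the settings can be inspected without TPOT installed.
    from tpot import TPOTClassifier

    kwargs = {**low_memory_tpot_kwargs(), **overrides}
    model = TPOTClassifier(**kwargs)
    model.fit(X, y)
    return model
```

You would then call fit_tpot(X_train, y_train) and, if that runs cleanly, raise n_jobs or switch back to the default configuration one change at a time to see which setting triggers the failure.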