tpot
Add the ability to pickle TPOTRegressor object
The following text is more of a feature request than a bug report.
I have been using Flask to create a web app before trying TPOT. Flask requires a pickle file of the model to predict and display the result on the web app. However, since the TPOT object is not pickle-able, the web app isn't functional anymore.
It would be great if the ability to pickle the TPOTRegressor object were added.
I guess you can just pickle the final pipeline object (tpot.fitted_pipeline_), which is a normal sklearn pipeline and can be used independently from tpot.
I tried but doing that returned an AttributeError.
Where did you get the error? I tried it and it worked.
Can you show the code that works?
After fitting your tpot object you just call
import pickle
pipeline_dump = pickle.dumps(tpot.fitted_pipeline_)
pipeline = pickle.loads(pipeline_dump)
print(pipeline)
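For a web app the pipeline usually goes through a file rather than an in-memory byte string. Below is a minimal sketch of that round trip. To run without TPOT it uses a plain scikit-learn Pipeline as a stand-in for tpot.fitted_pipeline_ (an assumption, as are the toy data and the file name model.pkl).

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for tpot.fitted_pipeline_ (assumption): any fitted sklearn Pipeline.
rng = np.random.default_rng(0)
X = rng.random((18, 2))          # 18 rows, 2 features (e.g. longitude, latitude)
y = X @ np.array([1.5, -2.0])    # 1-D target, shape (18,)
fitted_pipeline = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# Write the fitted pipeline to disk; "model.pkl" is a hypothetical file name.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(fitted_pipeline, f)

# Load it back, as the web app would do at start-up.
with open(model_path, "rb") as f:
    model = pickle.load(f)

# The restored pipeline should predict exactly like the original one.
restored_ok = np.allclose(model.predict(X), fitted_pipeline.predict(X))
print(restored_ok)
```

With TPOT, the only change should be replacing fitted_pipeline with tpot.fitted_pipeline_ after calling fit.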
Thanks for the code. It worked. But when I loaded the model on Flask, I'm getting a strange error.
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature
The web app I am creating requires 2 parameters longitude and latitude to show prediction.
Here is the Colab notebook link: https://colab.research.google.com/drive/178OrUuWEigZC-y2A2O83AVDGZdwxIsoW?usp=sharing
How do you call the pipeline when you do predictions, and what is the shape and type of your input data?
You can do predictions by executing pipeline.predict(data), where data can be a data frame or matrix with the same number of columns as you used during training.
(Your link can't be accessed)
I called the pipeline using pickle.load. The shape of the data is 18x3. The datatype is float. The link is accessible now. I'll try pipeline.predict to make predictions.
You called .predict(y), but you have to call .predict(X).
You are right. Actually, I was checking what is the reason for the error. Turns out that when I did .predict(y), I got the same error as I mentioned above. However, in the Flask code, I'm feeding X and still getting the same error as I got in .predict(y).
As mentioned above, the number of columns in your input data must match exactly the number used during training. You pass a list, so it's just 1 column with two rows. You need to do something like np.column_stack((a, b)), but there are different ways.
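A small sketch of what np.column_stack((a, b)) does here, and of the reshape alternative for a single point. The coordinate values and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical inputs: one list per feature.
a = [67.0, 67.1, 67.2]   # e.g. longitudes
b = [24.8, 24.9, 25.0]   # e.g. latitudes

# Each 1-D list becomes one column: 3 rows, 2 features.
X_new = np.column_stack((a, b))
print(X_new.shape)   # (3, 2)

# A single (longitude, latitude) pair still has to be 2-D: 1 row, 2 columns.
single = np.column_stack(([67.0], [24.8]))
print(single.shape)  # (1, 2)

# Alternative for one sample: build a flat array, then reshape it to one row.
flat = np.array([67.0, 24.8])   # shape (2,): 1-D, not a valid input for predict
row = flat.reshape(1, -1)       # shape (1, 2)
print(row.shape)     # (1, 2)
```

Either single or row can then be passed to pipeline.predict, since both have the 2 columns the model was trained with.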
Can you show a practical example of reshaping the data? Or can you please do the same in the notebook's link so I can get an idea?
np.column_stack([list-variable])
This doesn't work. Also, I'm getting an error message saying that the features should be 2D while the target should be 1D.
Just print the type and shape of your input. There are many threads on Stack Overflow on how to convert it into 2D.
When converted into 2D, this is what I got.
Also, as seen here, TPOT lacks multi-label regression ability.
Do you have multiple variables at output? If not then it must work. Just check the shapes of your input X and your target y.
No, I have a single variable at the output. The shape of X is (18, 2) and that of y is (18, 1).
But then it's not multi label.
By the way, I recommend restricting access to your notebook again to avoid security issues.
So what's the solution now?
As explained, pickling is possible with TPOT. Also, you don't have a multi-label problem here; you just have to shape your data correctly.
@hanshupe thank you for answering the question here.
@RafeyIqbalRahman I think the shape of y should be (18,) instead of (18, 1) if y is a 1-D array.
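A quick sketch of that shape change, using a made-up target array of the same size as in this thread:

```python
import numpy as np

# Hypothetical target with the shape reported above: 18 rows, 1 column.
y = np.arange(18, dtype=float).reshape(18, 1)
print(y.shape)        # (18, 1)

# ravel() returns a 1-D view with the same values.
y_flat = y.ravel()
print(y_flat.shape)   # (18,)
```

y.flatten() gives the same shape but always copies the data; either result is a valid 1-D target for fit.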
I reshaped y using .flatten but still the Flask app is not showing the predicted result.
@RafeyIqbalRahman I cannot check your Colab notebook (maybe a permission issue?) and am not sure why the Flask app is not working. Could you please provide a demo for reproducing the issue of pickling tpot.fitted_pipeline_?
@weixuanfu this is the link to my Colab notebook: https://colab.research.google.com/drive/1jVRhIZEV8rjdsQFPvWJof9R7wJcdF-cd?usp=sharing. I feel there's some issue with the pickle file that's why the Flask app is not working.
I had a quick look. I think final_features only has 1 feature, but the model was fitted with 2 features (X.shape=(18,2)), which makes the prediction fail. You can add a line like print(final_features.shape) before prediction = model.predict(final_features) to check that.
Thanks. Since final_features is a list object, I used len(final_features) to get the length. The length turned out to be 1, while the model is fitted with 2 features. How can I solve this?
When I tried to reshape final_features, I got a ValueError saying that an array of size 1 cannot be reshaped into shape (1,2).
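That reshape error means only one value reached final_features, so there is nothing to reshape into two columns: both the longitude and the latitude have to be collected before reshaping. A minimal sketch of a check that makes this visible is below. The helper name to_feature_row is hypothetical, and it assumes the form values arrive as strings; how they are read from the Flask request is not shown.

```python
import numpy as np


def to_feature_row(raw_values, n_features=2):
    """Convert raw form values (strings) into a float array of shape (1, n_features)."""
    values = [float(v) for v in raw_values]
    if len(values) != n_features:
        # Fail early with a readable message instead of a matmul/reshape error.
        raise ValueError(
            f"expected {n_features} values, got {len(values)}: {values}"
        )
    return np.array(values).reshape(1, -1)


# Both values present: one row with two columns, ready for model.predict.
final_features = to_feature_row(["67.0", "24.8"])
print(final_features.shape)   # (1, 2)

# Only one value present: the situation described above.
try:
    to_feature_row(["67.0"])
except ValueError as err:
    print(err)
```

If the second call's message appears in the app, the problem is in how the form values are gathered, not in the pickle file.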