
XGBRegressor ValueError: feature_names mismatch:

Open • EduardoUrbiola opened this issue 3 years ago • 1 comment

I run into an issue every time I try to run the following code:

pipeline:

import pandas as pd
from xgboost import XGBRegressor

housing2 = pd.read_csv('https://raw.githubusercontent.com/byui-cse/cse450-course/master/data/housing_holdout_test.csv')

overall_model = XGBRegressor(max_depth=6, colsample_bytree=0.7)
overall_model.fit(X_train, y_train)

new_predictions = overall_model.predict(housing2[['view', 'sqft_living', 'grade', 'waterfront', 'lat', 'zipcode']])
print(new_predictions)

ValueError: feature_names mismatch: ['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'f10', 'f11', 'f12', 'f13', 'f14', 'f15', 'f16', 'f17'] ['view', 'sqft_living', 'grade', 'waterfront', 'lat', 'zipcode']
expected f5, f8, f10, f6, f12, f1, f4, f9, f13, f3, f14, f17, f0, f2, f11, f16, f7, f15 in input data
training data did not have the following fields: waterfront, zipcode, view, lat, grade, sqft_living

EduardoUrbiola, Oct 20 '21 21:10

Can you provide code to fully reproduce this issue?

It looks like the column names differ between housing2 and X_train. How is X_train being created?

perib, May 09 '23 01:05
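
The generic names f0 through f17 in the error are what XGBoost assigns when the training data carries no column names, so one plausible reading is that X_train was passed as a NumPy array (or otherwise lost its headers) with 18 columns, while the prediction input is a DataFrame with 6 named columns. The sketch below only illustrates that comparison using the names copied from the error message; compare_feature_names is a hypothetical helper written for this thread, not part of XGBoost or TPOT, and how X_train was actually built is still unknown.

```python
def compare_feature_names(trained, supplied):
    """Hypothetical helper: return (missing, unexpected) names, keeping input order.

    missing    -> names the model was trained on that the new input lacks
    unexpected -> names in the new input that the model never saw
    """
    trained_set, supplied_set = set(trained), set(supplied)
    missing = [name for name in trained if name not in supplied_set]
    unexpected = [name for name in supplied if name not in trained_set]
    return missing, unexpected


# Names copied from the error message in this issue.
trained_names = [f"f{i}" for i in range(18)]  # f0 .. f17
predict_columns = ['view', 'sqft_living', 'grade', 'waterfront', 'lat', 'zipcode']

missing, unexpected = compare_feature_names(trained_names, predict_columns)
print(len(missing), "trained features are absent from the prediction input")
print(len(unexpected), "prediction columns were never seen in training:", unexpected)
```

If that reading is right, the likely fix is to fit and predict on the same set of columns, for example by keeping X_train as a DataFrame restricted to the same six columns that are later selected from housing2. On a fitted model, the trained names can usually be inspected through overall_model.get_booster().feature_names, though this may be empty when the model was trained on unnamed arrays.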