hummingbird
Error when converting Json model generated by XGBoost to Pytorch
Hello,
I am trying to convert my model into a PyTorch model. The model is generated by XGBoost in Python and saved in JSON format, as follows:
. . .
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, stratify=Y, random_state=1)
enc = OneHotEncoder(handle_unknown='ignore')
Y_train = enc.fit_transform(y_train.values.reshape(-1,1)).toarray()
Y_test = enc.fit_transform(y_test.values.reshape(-1,1)).toarray()
train = np.argmax(Y_train,axis = 1)
test = np.argmax(Y_test, axis = 1)
clf = XGBClassifier(learning_rate=0.1, n_estimators=400, max_depth=5, min_child_weight=1, gamma=0, subsample=0.8, colsample_bytree=0.8, objective='multi:softprob', nthread=4, num_class=3, seed=27, predictor = 'gpu_predictor', tree_method='gpu_hist', gpu_id=0)
clf.fit(X_train,train.ravel(), verbose=False, early_stopping_rounds=50, eval_metric='merror', eval_set=[(X_test, test.ravel())])
y_predict = clf.predict(X_test)
y_train_predict = clf.predict(X_train)
clf.save_model('model.json')
model = XGBClassifier()
model.load_model('model.json')
Here I try to convert it:
from hummingbird.ml import convert
modelc = convert(model, 'pytorch', extra_config={"n_features":54})
modelc.save('hb_model')
###ERROR
ValueError Traceback (most recent call last)
. . .
ValueError: invalid literal for int() with base 10: 'chroma_stft'
Do you know what I could possibly be doing wrong?
Hi! Maybe by converting the model to JSON and back, some fields get erased in the process. I will look into it and keep you posted.
I'd just like to add that I also tried it without saving as JSON:
clf.save_model('model')
model = XGBClassifier()
model.load_model('model')
#---------------------------------------
from hummingbird.ml import convert
modelc = convert(clf, 'pytorch')
#modelc.save('hb_model')
!!!ERROR
ValueError: invalid literal for int() with base 10: 'chroma_stft'
The same error happens.
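For anyone reading along: the message is what Python's built-in int() raises for a non-numeric string, so it looks like something in the conversion is treating a feature name as an integer index. Below is a minimal sketch of that reading, not a confirmed diagnosis. It assumes the names come from the training data's column labels; `feature_index` is a hypothetical helper (not part of Hummingbird or XGBoost), and every name other than 'chroma_stft' is a placeholder.

```python
# Reproduce the message from the traceback with the built-in int().
try:
    int("chroma_stft")
except ValueError as exc:
    message = str(exc)

print(message)


def feature_index(names):
    """Hypothetical helper: map each feature name to its column position."""
    return {name: position for position, name in enumerate(names)}


# Placeholder names; only 'chroma_stft' appears in the traceback.
names = ["chroma_stft", "feature_b", "feature_c"]
index_of = feature_index(names)

# An integer position is what int() could not derive from the name itself.
print(index_of["chroma_stft"])
```

If this reading is right, the model would need integer-like feature identifiers at conversion time, but that is an assumption until the maintainers confirm it.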
I see. What's the XGBoost version you are using?
I'm using '1.5.0'.
Hello, I changed it to this:
from hummingbird.ml import convert
m=convert(model, "pytorch")
print(m)
#modelc = convert(clf, 'pytorch')
m.save('hb_model')
And I could download a file. It comes as "hb_model.zip", a zipped folder. I tried to open it and couldn't. I just wanted to get a PyTorch model. Is this normal? Is this zip folder expected? How can I get my PyTorch model? I also tried saving it with m.save('hb_model.pt') and the saved file is hb_model.pt.zip, again a zipped folder.
Yes, our model is a zip because it contains a bunch of stuff, not just the PyTorch model. But you can load the zip file using hummingbird.ml.load('hb_model'). From there you should see the PyTorch model.
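If you only want to check what ended up inside the archive, Python's standard zipfile module can list it without unpacking anything. A minimal sketch: `list_archive` is a hypothetical helper, and the archive built here is a stand-in with placeholder member names so the snippet runs on its own. It does not reflect Hummingbird's real file layout.

```python
import os
import tempfile
import zipfile


def list_archive(path):
    """Return the member names stored in a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Stand-in archive so the sketch runs without Hummingbird installed.
# The member names are placeholders, not Hummingbird's actual contents.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "hb_model.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("placeholder_a.bin", b"\x00")
    archive.writestr("placeholder_b.txt", "metadata")

members = list_archive(archive_path)
print(members)
```

Pointing `list_archive` at the real hb_model.zip should show which files were bundled alongside the PyTorch model.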
from hummingbird.ml import convert
m=convert(model, "pytorch")
#print(m)
#modelc = convert(clf, 'pytorch')
m.save('hb_model')
mh=m.load('hb_model')
print(mh)
-----> output: <hummingbird.ml.containers.sklearn.pytorch_containers.PyTorchSklearnContainerClassification object at 0x7fda6e05d4d0>
OK, it's not giving an error. But I'd like to save this PyTorch model to a file because I need to use it later for something else (I need to load this PyTorch model in another place). Is this possible?
OK, in this case you can just save m.model. m.model is a PyTorch model, so you can use it as such.
closed with #562