Error when converting JSON model generated by XGBoost to PyTorch
Hello,
I am trying to convert my model into a PyTorch model. The model is generated by XGBoost in Python and saved in JSON format, as follows: . . .
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, stratify=Y, random_state=1)
enc = OneHotEncoder(handle_unknown='ignore')
Y_train = enc.fit_transform(y_train.values.reshape(-1,1)).toarray()
Y_test = enc.fit_transform(y_test.values.reshape(-1,1)).toarray()
train = np.argmax(Y_train, axis=1)
test = np.argmax(Y_test, axis=1)
clf = XGBClassifier(learning_rate=0.1, n_estimators=400, max_depth=5, min_child_weight=1,
                    gamma=0, subsample=0.8, colsample_bytree=0.8, objective='multi:softprob',
                    nthread=4, num_class=3, seed=27, predictor='gpu_predictor',
                    tree_method='gpu_hist', gpu_id=0)
clf.fit(X_train, train.ravel(), verbose=False, early_stopping_rounds=50,
        eval_metric='merror', eval_set=[(X_test, test.ravel())])
y_predict = clf.predict(X_test)
y_train_predict = clf.predict(X_train)
clf.save_model('model.json')
model = XGBClassifier()
model.load_model('model.json')
Here I try to convert it:
from hummingbird.ml import convert
modelc = convert(model, 'pytorch', extra_config={"n_features":54})
modelc.save('hb_model')
###ERROR
ValueError Traceback (most recent call last)
. . .
ValueError: invalid literal for int() with base 10: 'chroma_stft'
Do you know what I could be doing wrong?
Hi! Maybe by converting the model to JSON and back, some fields get erased in the process. I will look into it and keep you posted.
On Thu, Dec 2, 2021, 8:33 AM rodsang wrote the original post quoted above (https://github.com/microsoft/hummingbird/issues/551).
I'd just like to add that I also tried it without saving in JSON:
clf.save_model('model')
model = XGBClassifier()
model.load_model('model')
#---------------------------------------
from hummingbird.ml import convert
modelc = convert(clf, 'pytorch')
#modelc.save('hb_model')
!!!ERROR
ValueError: invalid literal for int() with base 10: 'chroma_stft'
And the same error happens.
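The message itself is what Python's built-in int() raises when it is handed a non-numeric string. One plausible reading, which is an assumption and not something confirmed in this thread, is that a parser expected XGBoost's default feature names (f0, f1, ...) and instead met a named column such as 'chroma_stft'. The sketch below only reproduces that failure mode in plain Python; `parse_default_feature_name` is a hypothetical helper, not Hummingbird's code, and every feature name other than 'chroma_stft' is a placeholder.

```python
def parse_default_feature_name(name: str) -> int:
    """Hypothetical helper: turn a default XGBoost feature name like 'f12' into 12."""
    if name.startswith("f") and name[1:].isdigit():
        return int(name[1:])
    # Anything else goes straight to int(), which fails on named columns.
    return int(name)


print(parse_default_feature_name("f12"))

try:
    parse_default_feature_name("chroma_stft")
except ValueError as err:
    # Same wording as the error reported in the thread.
    print(err)

# A possible workaround under the same assumption: map column names to positions.
# Only 'chroma_stft' comes from the thread; the other names are placeholders.
feature_names = ["chroma_stft", "feature_b", "feature_c"]
name_to_index = {name: i for i, name in enumerate(feature_names)}
print(name_to_index["chroma_stft"])
```

If this reading is right, training on a plain NumPy array instead of a DataFrame with named columns may sidestep the problem, since XGBoost then falls back to its default feature names.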
I see. What's the XGBoost version you are using?
I'm using version '1.5.0'.
Hello, I changed it to this:
`from hummingbird.ml import convert
m = convert(model, "pytorch")
print(m)
#modelc = convert(clf, 'pytorch')
m.save('hb_model')`
This produced a file I could download, named "hb_model.zip". It is a zipped folder, and I can't open it. I just wanted to get a PyTorch model. Is this normal? Is a ZIP folder expected? How can I get my PyTorch model? I also tried m.save('hb_model.pt'), and the saved file is hb_model.pt.zip, again a zipped folder.
Yes, our model is a zip because it contains a bunch of stuff, not just the PyTorch model. But you can load the zip file using hummingbird.ml.load('hb_model'). From there you should see the PyTorch model.
from hummingbird.ml import convert
m = convert(model, "pytorch")
#print(m)
#modelc = convert(clf, 'pytorch')
m.save('hb_model')
mh = m.load('hb_model')
print(mh)
-----> output: <hummingbird.ml.containers.sklearn.pytorch_containers.PyTorchSklearnContainerClassification object at 0x7fda6e05d4d0>
OK, it's not giving an error now. But I'd like to save this PyTorch model to a file, because I need to load it somewhere else later. Is this possible?
OK, in this case you can just save m.model. m.model is a PyTorch model, so you can use it as such.
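A minimal sketch of that suggestion, assuming `m` is the container returned by `convert(model, "pytorch")` and that `m.model` is a regular `torch.nn.Module`, as stated above. The helper names and file name are hypothetical. The `weights_only=False` argument assumes a reasonably recent PyTorch release; older versions may not accept it.

```python
def export_torch_module(container, path):
    """Save only the torch.nn.Module held in `container.model` (hypothetical helper)."""
    import torch  # imported lazily so the sketch can be read without PyTorch installed

    torch.save(container.model, path)


def load_torch_module(path):
    """Load a module pickled by export_torch_module (hypothetical helper)."""
    import torch

    # A whole pickled module needs weights_only=False on recent PyTorch releases.
    # Only do this with files you trust, since unpickling can run arbitrary code.
    return torch.load(path, weights_only=False)


# Intended usage, with `m` coming from hummingbird.ml.convert:
# export_torch_module(m, "xgb_torch_model.pt")
# torch_model = load_torch_module("xgb_torch_model.pt")
```

Because this pickles the whole module, the environment that loads the file most likely needs Hummingbird installed as well, so that the classes referenced by the pickle can be imported.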
closed with #562