Error when converting JSON model generated by XGBoost to PyTorch

Open rodsang opened this issue 3 years ago • 8 comments

Hello,

I am trying to convert my model into a PyTorch model. My model is generated by XGBoost in Python and saved in JSON format, as follows: . . .

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, stratify=Y, random_state=1)
enc = OneHotEncoder(handle_unknown='ignore')
Y_train = enc.fit_transform(y_train.values.reshape(-1,1)).toarray()
Y_test = enc.fit_transform(y_test.values.reshape(-1,1)).toarray()
train = np.argmax(Y_train,axis = 1)
test = np.argmax(Y_test, axis = 1)
clf = XGBClassifier(learning_rate=0.1, n_estimators=400, max_depth=5, min_child_weight=1, gamma=0, subsample=0.8, colsample_bytree=0.8, objective='multi:softprob', nthread=4, num_class=3, seed=27, predictor = 'gpu_predictor', tree_method='gpu_hist', gpu_id=0)

clf.fit(X_train,train.ravel(), verbose=False, early_stopping_rounds=50, eval_metric='merror', eval_set=[(X_test, test.ravel())])

y_predict = clf.predict(X_test)
y_train_predict = clf.predict(X_train)

clf.save_model('model.json')

model = XGBClassifier()

model.load_model('model.json')

Here I try to convert it:

from hummingbird.ml import convert

modelc = convert(model, 'pytorch', extra_config={"n_features":54})
modelc.save('hb_model')

###ERROR


ValueError Traceback (most recent call last) in
      4
      5
----> 6 modelc = convert(model, 'pytorch', extra_config={"n_features":54})
      7 #modelc.save('hb_model')

. . .

ValueError: invalid literal for int() with base 10: 'chroma_stft'

Do you know what I could possibly be doing wrong?

rodsang avatar Dec 02 '21 16:12 rodsang

Hi! Maybe by converting the model to JSON and back, some fields get erased in the process. I will look into it and keep you posted.

View this issue on GitHub: https://github.com/microsoft/hummingbird/issues/551

interesaaat avatar Dec 02 '21 17:12 interesaaat

I'd just like to add that I also tried it without saving as JSON:

clf.save_model('model')
model = XGBClassifier()

model.load_model('model')
#---------------------------------------
from hummingbird.ml import convert

modelc = convert(clf, 'pytorch')
#modelc.save('hb_model')

!!!ERROR

ValueError: invalid literal for int() with base 10: 'chroma_stft'

And the same error happens.
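A note on the message itself: it is exactly what Python's `int()` raises when it is handed the string `'chroma_stft'`. One plausible reading, not confirmed anywhere in this thread, is that the converter expects feature identifiers that reduce to integer indices (such as `f12`) and instead receives a column name. The sketch below only illustrates that failure mode; `parse_feature_index` is a hypothetical helper, not Hummingbird's actual code.

```python
def parse_feature_index(name: str) -> int:
    """Hypothetical parser: accept 'f12' or '12', reject arbitrary column names."""
    # Strip a leading 'f' if present, then require the rest to be an integer.
    digits = name[1:] if name.startswith("f") else name
    return int(digits)


print(parse_feature_index("f12"))  # 12

try:
    parse_feature_index("chroma_stft")
except ValueError as exc:
    # invalid literal for int() with base 10: 'chroma_stft'
    print(exc)
```

If this reading is right, fitting the classifier on a plain NumPy array instead of a DataFrame with named columns might avoid the error, but that is an untested assumption.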

rodsang avatar Dec 02 '21 18:12 rodsang

I see. What's the XGBoost version you are using?

interesaaat avatar Dec 02 '21 18:12 interesaaat

I'm using version '1.5.0'.

rodsang avatar Dec 02 '21 18:12 rodsang

Hello, I changed it to this:

from hummingbird.ml import convert

m = convert(model, "pytorch")
print(m)

#modelc = convert(clf, 'pytorch')
m.save('hb_model')

And I could download a file named "hb_model.zip", which is a zipped folder. I tried to open it and couldn't. I just wanted to get a PyTorch model. Is this normal? Is this zip folder expected? How can I get my PyTorch model? I also tried saving it with m.save('hb_model.pt'), and the saved file is hb_model.pt.zip, again a zipped folder.

angecas avatar Dec 02 '21 23:12 angecas

Yes, our model is a zip because it contains a bunch of stuff, not just the PyTorch model. But you can load the zip file using hummingbird.ml.load('hb_model'). From there you should see the PyTorch model.
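For anyone who just wants to see what the archive holds without loading it through Hummingbird, Python's standard `zipfile` module can list its members. This is a minimal sketch: `list_archive` is an illustrative helper, and the stand-in archive built here (with a made-up member name) exists only so the example runs on its own. It says nothing about the real layout of a Hummingbird archive.

```python
import os
import tempfile
import zipfile


def list_archive(path):
    """Return the member names stored in a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Stand-in archive so the sketch is self-contained; the member name is made up.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "hb_model.zip")
with zipfile.ZipFile(demo_path, "w") as archive:
    archive.writestr("placeholder.txt", "not a real model file")

print(list_archive(demo_path))
```

With the real file, the call would presumably be `list_archive('hb_model.zip')`.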

interesaaat avatar Dec 02 '21 23:12 interesaaat

from hummingbird.ml import convert

m = convert(model, "pytorch")
#print(m)

#modelc = convert(clf, 'pytorch')
m.save('hb_model')
mh = m.load('hb_model')
print(mh)

-----> output: <hummingbird.ml.containers.sklearn.pytorch_containers.PyTorchSklearnContainerClassification object at 0x7fda6e05d4d0>

OK, it's not giving an error. But I'd like to save this PyTorch model to a file because I need to use it later for something else (I need to load this PyTorch model somewhere else). Is this possible?

angecas avatar Dec 02 '21 23:12 angecas

OK, in this case you can just save m.model. m.model is a PyTorch model, so you can use it as such.
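A minimal sketch of that suggestion, assuming `m` is the container returned by `convert` and that `m.model` is a regular `torch.nn.Module`, as stated above. The helper names and the file name used in the usage note are illustrative. Because `torch.save` pickles the whole module, the environment that loads the file would presumably also need Hummingbird installed so the module's classes can be resolved. Passing `weights_only=False` is an assumption about recent PyTorch releases, where `torch.load` otherwise declines to unpickle full modules.

```python
def save_torch_module(container, path):
    """Save the PyTorch module wrapped by a Hummingbird container."""
    import torch  # imported lazily so the helper can be defined without torch

    torch.save(container.model, path)


def load_torch_module(path):
    """Load a module previously written by save_torch_module."""
    import torch

    # weights_only=False unpickles the full module; only use it on trusted files.
    module = torch.load(path, weights_only=False)
    module.eval()  # switch to inference mode
    return module
```

Usage would look like `save_torch_module(m, 'hb_model_torch.pt')` in the training environment and `load_torch_module('hb_model_torch.pt')` wherever the model is needed later.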

interesaaat avatar Dec 03 '21 00:12 interesaaat

closed with #562

ksaur avatar Aug 19 '22 17:08 ksaur