sklearn-onnx
Batch inference
Hello!
I'm looking to do batch prediction using a model converted from scikit-learn to an ONNX Runtime backend. I've found that batch prediction only works sporadically, both on my local machine and on a virtual machine. The errors are a generic "MemoryError" with no further detail:
[screenshot of the MemoryError traceback]
The model itself is a OneVsRest RandomForest model. The input in this case was a 115k x 2853 matrix (count-vectorized text: ~115k records with 2853 tokens).
I've been told this is a bug within sklearn-onnx, since the original scikit-learn models handle batch prediction without issue. Would it be possible to support batch prediction within the ONNX objects as well? It would be a huge help for our team :)
I tried this and didn't see the error:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from onnx import save_model
from onnxruntime import InferenceSession

X, y = make_classification(n_features=2853, n_samples=230000, random_state=42, n_classes=3, n_informative=7)
X = X.astype(np.float32)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)
model = OneVsRestClassifier(RandomForestClassifier(random_state=42))
model.fit(X_train, y_train)
onnx_fs = convert_sklearn(model, 'rf', [('float_input', FloatTensorType([None, X_test.shape[1]]))])
save_model(onnx_fs, 'rf_cla.onnx')
sess = InferenceSession('rf_cla.onnx')
res = sess.run(None, input_feed={'float_input': X_test})
# res[1] is the ZipMap output (a list of {label: proba} dicts); rebuild a dense array.
print(np.mean(np.isclose(model.predict_proba(X_test),
                         list(map(lambda x: list(map(lambda y: x[y], x)), res[1])), atol=1e-7)))
print(np.mean(res[0] == np.argmax(model.predict_proba(X_test), axis=1)))
Can you try a smaller batch size maybe? How much RAM do you have on your machine?
I can score mini-batches and reassemble at the end as a short-term fix if needed. My local machine has 32 GB of RAM (where it succeeds if I make sure not many other tasks are running). The DevOps machines I've checked list 7.6 GB of RAM.
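For reference, a minimal sketch of that mini-batch fallback, assuming the 'rf_cla.onnx' model and 'float_input' name from the snippet above (predict_in_batches is just a hypothetical helper, not part of onnxruntime):

import numpy as np
from onnxruntime import InferenceSession

sess = InferenceSession('rf_cla.onnx')

def predict_in_batches(sess, X, batch_size=10000):
    # Score fixed-size row chunks and reassemble at the end, so only
    # one batch's worth of input/output is held in memory at a time.
    labels, probas = [], []
    for start in range(0, X.shape[0], batch_size):
        out = sess.run(None, {'float_input': X[start:start + batch_size]})
        labels.append(out[0])
        probas.extend(out[1])  # ZipMap output: one {label: proba} dict per row
    return np.concatenate(labels), probas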
Would we expect ONNX Runtime to have a higher memory requirement than scikit-learn? I thought the problem was that the scikit-learn models run successfully while onnxruntime throws the error.
I have 16 GB of RAM and haven't seen this error. I can try with a higher batch size. By the way, can you share the code that gives the error?
Sure! Apologies for the slow response. Unfortunately, there isn't much more than what I shared in the snapshot above.
# Load the appropriate CV (CountVectorizer) object. In this case, it has 2853 attributes.
cv = cv_dict[ModelId]
# Call NLP library for stopwords/lemmatization, etc. Currently have ~115k texts.
data_toScore['ProcessedText'] = NLP.normalize(data_toScore['Text'])
ModelBackend = backend.prepare(RF_Model_Map[ModelId].SerializeToString(), 'CPU')
labels, probas = ModelBackend.run(cv.transform(data_toScore.ProcessedText).todense().astype(np.float32))
The error occurs on the final line. Actually, I'm now wondering whether the conversion of the CSR matrix to a dense matrix of floats is what's causing the issue. I'll set up some testing for this on one of the build machines and report the results. Will ONNX Runtime eventually support CSR format as an input?
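For a back-of-the-envelope check on that theory (my own arithmetic, not measured on the build machine): CountVectorizer defaults to int64, so .todense() materializes the full matrix at 8 bytes per cell, and .astype(np.float32) then makes a second copy:

import numpy as np

rows, cols = 115000, 2853  # shape quoted earlier in the thread
int64_copy = rows * cols * 8 / 2**30    # .todense() output, ~2.4 GiB
float32_copy = rows * cols * 4 / 2**30  # .astype(np.float32) copy, ~1.2 GiB
print(f"peak for the two dense copies alone: ~{int64_copy + float32_copy:.1f} GiB")

On a 7.6 GB machine, ~3.7 GiB of transient copies on top of the RandomForest model and onnxruntime's own buffers could plausibly trigger a MemoryError; densifying each CSR slice with .toarray().astype(np.float32) just before scoring would keep the peak proportional to the batch size instead.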
That could well be the reason. @xadupre Do you know if we'll support sparse format in onnxruntime anytime soon?
Sparse tensors are not currently a high priority for onnxruntime. We could write the converters to create an ONNX graph using sparse tensors, but it would be difficult to make sure it produces the same outputs, as there is no runtime that fully supports sparse tensors.
Closing the issue, feel free to reopen it.