onnxmltools
How do I convert a Spark model with multiple input columns to ONNX and score it with a dynamic batch size?
Hi,
I converted a logistic regression model with dynamic batch size from Spark ML to ONNX using this:
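    from onnxmltools import convert_sparkml
    from onnxmltools.convert.common.data_types import FloatTensorType

    # None in the first dimension leaves the batch size dynamic
    initial_types = [('Features', FloatTensorType([None, 5]))]
    onnx_model = convert_sparkml(s_clf, 'Occupancy detection Pyspark Logistic Regression model', initial_types, spark_session=sess)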
Then I successfully scored df1, a dynamic batch of samples whose shape is (12417, 5), using the code below:
    import numpy as np
    import onnxruntime as rt

    # bmodel is the converted ONNX model (path or serialized bytes)
    sess = rt.InferenceSession(bmodel)
    input_name = sess.get_inputs()[0].name
    label_name = sess.get_outputs()[0].name
    df1 = df[features_cols]
    # a single (12417, 5) float32 matrix goes to the one 'Features' input
    predictions = sess.run([label_name], {input_name: df1.values.astype(np.float32)})[0]
Now I am trying to build a pipeline and convert it to ONNX. I tried converting its first stage, which is just a VectorAssembler:
    initial_types = [
        ('Temperature', FloatTensorType([None, 1])),
        ('Humidity', FloatTensorType([None, 1])),
        ('Light', FloatTensorType([None, 1])),
        ('CO2', FloatTensorType([None, 1])),
        ('HumidityRatio', FloatTensorType([None, 1])),
    ]
    onnx_model = convert_sparkml(assembler, 'Occupancy detection Pyspark Assembler of features', initial_types, spark_session=sess)
Trying to consume it using this code:
    predictions = sess.run([label_name], {
        "Temperature": [df1.Temperature.values.astype(np.float32)],
        "Humidity": [df1.Humidity.values.astype(np.float32)],
        "Light": [df1.Light.values.astype(np.float32)],
        "CO2": [df1.CO2.values.astype(np.float32)],
        "HumidityRatio": [df1.HumidityRatio.values.astype(np.float32)],
    })[0]
fails with:
    [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: Light for the following indices index: 1 Got: 12417 Expected: 1.
Just for testing, I selected a single sample by adding df1 = df1[:1], and then the code above works.
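One likely reading of this error, assuming the inputs are declared as [None, 1] as in the conversion above: wrapping a 1-D array of N values in a Python list produces shape (1, N), not (N, 1). Index 1 of that shape is then N (here 12417) where the model expects 1, and with a single sample both shapes coincide at (1, 1), which would explain why df1[:1] works. A minimal NumPy sketch with made-up values (the `matches_declared` helper is hypothetical and not part of onnxruntime):

```python
import numpy as np

def matches_declared(declared, actual):
    """Return True if `actual` fits `declared`, where None means any size."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))

declared = [None, 1]                      # shape passed to FloatTensorType
column = np.arange(4, dtype=np.float32)   # stand-in for one DataFrame column

wrapped = np.asarray([column])            # what [column] becomes: shape (1, 4)
reshaped = column.reshape(-1, 1)          # one row per sample: shape (4, 1)
single = np.asarray([column[:1]])         # one wrapped sample: shape (1, 1)

for name, arr in [("wrapped", wrapped), ("reshaped", reshaped), ("single", single)]:
    print(name, arr.shape, matches_declared(declared, arr.shape))
```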
How can I export a model with multiple input columns like the one above so that I can run it with a dynamic batch size? How come Logistic Regression works flawlessly, while this simple VectorAssembler fails?
Thanks for your help, Adi
Were you able to solve that issue?
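For anyone landing here with the same error: a possible fix, assuming each input was declared as FloatTensorType([None, 1]), is to feed every column as an (N, 1) float32 array instead of wrapping it in a list. The sketch below uses a hypothetical `build_feed` helper and made-up values, and it has not been run against the converted VectorAssembler model.

```python
import numpy as np

FEATURE_COLS = ["Temperature", "Humidity", "Light", "CO2", "HumidityRatio"]

def build_feed(columns):
    """Map each column name to an (N, 1) float32 array for sess.run()."""
    return {
        name: np.asarray(values, dtype=np.float32).reshape(-1, 1)
        for name, values in columns.items()
    }

# Made-up batch of 3 samples; in the post this would come from df1[col].values.
batch = {name: [1.0, 2.0, 3.0] for name in FEATURE_COLS}
feed = build_feed(batch)

# Concatenating the columns along axis 1 gives the (N, 5) layout that the
# single-input logistic regression model was scored with.
assembled = np.concatenate([feed[name] for name in FEATURE_COLS], axis=1)
print(assembled.shape)  # (3, 5)
```

With the session from the post, the call would then look like `sess.run([label_name], build_feed({c: df1[c].values for c in FEATURE_COLS}))[0]`.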