mleap
How to convert a Spark DataFrame directly to an MLeap Tensor[Double]
Hi: I know how to convert an MLeap tensor to a TensorFlow tensor using our package, but I don't know how to convert a Spark DataFrame to an MLeap Tensor[Double]. I found the method TypeConverters.sparkToMleapValue(), but I don't know how to use it. Could you provide a short tutorial for this? Thanks.
Use org.apache.spark.sql.mleap.TypeConverters:
def sparkToMleapConverter(dataset: DataFrame,
                          field: StructField): (types.StructField, (Any) => Any) = {
  (sparkFieldToMleapField(dataset, field), sparkToMleapValue(field.dataType))
}
With this I still cannot get a Tensor[Double] from a Spark DataFrame.
As a caveat, converting from Spark to MLeap is an unusual thing that we don't normally need to do. If you have a Spark session and a DataFrame, then just do the work in Spark. The MLeap runtime is meant for cases where you don't have a Spark session (e.g., a real-time inference service).
That said, sparkToMleapConverter is the way to convert a single field if you really need to. If you need to convert the entire DataFrame, then toSparkLeapFrame is probably easier; you get it just by adding import ml.combust.mleap.spark.SparkSupport._. Take a look at the toSparkLeapFrame code to see how sparkToMleapConverter is used.
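To illustrate the whole-frame route described above, here is a minimal sketch. It assumes the column at the given position is a Spark ML vector column, that the MLeap row exposes getTensor[Double](idx), and that the tensor exposes toArray; the object name WholeFrameSketch, the helper tensorsAt, and the column name "features" are made up for the example. Check it against the MLeap version you are running.

```scala
import ml.combust.mleap.spark.SparkSupport._
import ml.combust.mleap.tensor.Tensor
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, not part of MLeap.
object WholeFrameSketch {
  // Converts the whole DataFrame to a SparkLeapFrame and returns the tensor
  // stored at column position `idx` for every row.
  // Assumption: that column is a vector column, so MLeap holds it as Tensor[Double].
  def tensorsAt(df: DataFrame, idx: Int): Array[Tensor[Double]] = {
    val leapFrame = df.toSparkLeapFrame // implicit added by SparkSupport._
    leapFrame.dataset                   // RDD of MLeap rows
      .map(row => row.getTensor[Double](idx))
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("leap-frame-sketch")
      .getOrCreate()
    import spark.implicits._

    // One-column DataFrame holding a dense ML vector.
    val df = Seq(Tuple1(Vectors.dense(1.0, 2.0, 3.0))).toDF("features")

    tensorsAt(df, 0).foreach(t => println(t.toArray.mkString(", ")))
    spark.stop()
  }
}
```

Collecting to the driver is only reasonable for small data; for anything larger, keep working on leapFrame.dataset as an RDD.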
Looking at the sparkFieldToMleapField code, the column needs to be a Spark VectorUDT, a MatrixUDT, or an Array[VectorUDT] in order to be converted to an MLeap tensor.
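For the single-field route, a rough sketch could look like the following. It assumes sparkToMleapConverter is reachable on the TypeConverters object with the signature quoted earlier in this thread, and that the value converter for a vector column returns a Tensor[Double]; the object name SingleFieldSketch, the helper columnToTensors, and the column name "features" are made up for the example.

```scala
import ml.combust.mleap.tensor.Tensor
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.mleap.TypeConverters

// Hypothetical helper object, not part of MLeap.
object SingleFieldSketch {
  // Converts one vector column to MLeap tensors on the driver.
  // Assumption: `column` has type VectorUDT; for other column types the
  // converter returns something else and the cast below will fail.
  def columnToTensors(df: DataFrame, column: String): Array[Tensor[Double]] = {
    val field = df.schema(column)
    // Returns the MLeap field description and a value converter; only the
    // converter is needed here.
    val (_, convert) = TypeConverters.sparkToMleapConverter(df, field)
    df.select(column)
      .collect()
      .map(row => convert(row.get(0)).asInstanceOf[Tensor[Double]])
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("single-field-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Tuple1(Vectors.dense(1.0, 2.0, 3.0))).toDF("features")

    columnToTensors(df, "features").foreach { t =>
      println(s"dimensions=${t.dimensions}, values=${t.toArray.mkString(", ")}")
    }
    spark.stop()
  }
}
```

If the DataFrame holds plain Double or Array[Double] columns, assembling them into a vector column first (for example with Spark's VectorAssembler) is one way to satisfy the VectorUDT requirement.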
OK, thanks. Using our package, I can now generate tensorflow-java tensors and NdArrays from a Spark DataFrame as expected!