mleap
org.apache.spark.sql.mleap.TypeConverters cannot convert a 2D tensor to a Matrix
The current implementation always converts a Tensor to a Vector.
The bug is in tt.dimensions.size: tt.dimensions is an Option[Seq[Int]], so calling size on it counts the elements of the Option, not the dimensions of the tensor. It returns 1 for Some and 0 for None. Because the assert guarantees that dimensions is defined, size is always 1 here, so in the following code a TensorType is always converted to VectorUDT:
def mleapTensorToSpark(tt: types.TensorType): DataType = {
  assert(TypeConverters.VECTOR_BASIC_TYPES.contains(tt.base),
    s"cannot convert tensor with base ${tt.base} to vector")
  assert(tt.dimensions.isDefined, "cannot convert tensor with undefined dimensions")
  if(tt.dimensions.isEmpty) {
    mleapBasicTypeToSparkType(tt.base)
  } else if(tt.dimensions.size == 1) {
    new VectorUDT
  } else if(tt.dimensions.size == 2) {
    new MatrixUDT
  } else {
    throw new IllegalArgumentException("cannot convert tensor for non-scalar, vector or matrix tensor")
  }
}
The same bug exists in the mleapToSparkValue function as well.
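To make the difference concrete, here is a minimal, dependency-free sketch of the Option.size pitfall and one possible way to check the rank instead. OptionSizeDemo, buggyRank, rank and sparkTypeName are hypothetical names used only for illustration, not MLeap or Spark APIs. The sketch returns type names as strings rather than real Spark types, and it assumes that an empty dimension list denotes a scalar.

```scala
object OptionSizeDemo {
  // What the current check effectively computes: the size of the Option
  // itself, which is 1 for Some(...) and 0 for None.
  def buggyRank(dimensions: Option[Seq[Int]]): Int = dimensions.size

  // What was presumably intended: the length of the wrapped Seq, if present.
  def rank(dimensions: Option[Seq[Int]]): Option[Int] = dimensions.map(_.length)

  // Picks a target type by rank. Strings stand in for the Spark types so the
  // sketch has no Spark dependency. Treating rank 0 as a scalar is an assumption.
  def sparkTypeName(dimensions: Option[Seq[Int]]): String = rank(dimensions) match {
    case Some(0) => "scalar"
    case Some(1) => "VectorUDT"
    case Some(2) => "MatrixUDT"
    case Some(n) =>
      throw new IllegalArgumentException(s"cannot convert tensor of rank $n")
    case None =>
      throw new IllegalArgumentException("cannot convert tensor with undefined dimensions")
  }
}
```

With this sketch, OptionSizeDemo.buggyRank(Some(Seq(3, 4))) evaluates to 1, while OptionSizeDemo.sparkTypeName(Some(Seq(3, 4))) selects "MatrixUDT". In the real converter, the equivalent change would be to unwrap the Option (for example with tt.dimensions.get after the assert) before calling size.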