doric
Coverage of Spark types
This issue tracks coverage of Spark types by doric. For each Spark type, doric must allow us:
- To create a doric column of the corresponding Scala type
- To collect Scala values from fields of `Row`s
- To create doric columns of the corresponding Scala type from literal values

The underlying Spark data type assigned by doric for a given Scala type `T` should be the data type resolved by Spark's `schemaFor[T]`. This data type will be determined statically by doric (through implicits), unlike the reflective approach followed by Spark.

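To illustrate what "determined statically (through implicits)" can look like, here is a minimal, dependency-free sketch. It is not doric's actual implementation: `StaticTypeSketch`, `SparkTypeOf` and the small `DataType` hierarchy are hypothetical stand-ins introduced only for this example, and the real Spark classes carry more information (nullability flags, for instance).

```scala
object StaticTypeSketch {
  // Hypothetical stand-ins for a few Spark data types (not the real Spark classes).
  sealed trait DataType
  case object IntegerType extends DataType
  case object StringType extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType

  // Hypothetical type class: evidence that the Scala type T has a known data type.
  trait SparkTypeOf[T] { def dataType: DataType }

  object SparkTypeOf {
    // Summoner: SparkTypeOf[T] is resolved by the compiler, not by reflection.
    def apply[T](implicit ev: SparkTypeOf[T]): SparkTypeOf[T] = ev

    private def instance[T](dt: DataType): SparkTypeOf[T] =
      new SparkTypeOf[T] { val dataType: DataType = dt }

    // Primitive and boxed integers share the same data type.
    implicit val intInstance: SparkTypeOf[Int] = instance(IntegerType)
    implicit val boxedIntInstance: SparkTypeOf[java.lang.Integer] = instance(IntegerType)
    implicit val stringInstance: SparkTypeOf[String] = instance(StringType)

    // Option[T] reuses the data type of T.
    implicit def optionInstance[T](implicit t: SparkTypeOf[T]): SparkTypeOf[Option[T]] =
      instance(t.dataType)

    // Collections are derived from the instances of their element types.
    implicit def seqInstance[T](implicit t: SparkTypeOf[T]): SparkTypeOf[Seq[T]] =
      instance(ArrayType(t.dataType))

    implicit def mapInstance[K, V](implicit
        k: SparkTypeOf[K],
        v: SparkTypeOf[V]
    ): SparkTypeOf[Map[K, V]] =
      instance(MapType(k.dataType, v.dataType))
  }

  def main(args: Array[String]): Unit = {
    println(SparkTypeOf[Int].dataType)
    println(SparkTypeOf[Option[String]].dataType)
    println(SparkTypeOf[Map[String, Seq[Int]]].dataType)
  }
}
```

With this encoding, asking for a type that has no instance (say `SparkTypeOf[java.util.UUID]`) is rejected at compile time instead of failing at run time, which is the practical difference from a reflective lookup.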
List of types (as of Spark 3.2.1):
Null type

- [x] `NullType`
  - [x] `Null`

Numeric types

- [x] `IntegerType`
  - [x] `Int`
  - [x] `java.lang.Integer`
- [x] `LongType`
  - [x] `Long`
  - [x] `java.lang.Long`
- [x] `FloatType`
  - [x] `Float`
  - [x] `java.lang.Float`
- [x] `DoubleType`
  - [x] `Double`
  - [x] `java.lang.Double`
- [x] `ShortType`
  - [x] `Short`
  - [x] `java.lang.Short`
- [x] `ByteType`
  - [x] `Byte`
  - [x] `java.lang.Byte`
- [ ] `DecimalType`
  - [ ] `Decimal`
  - [ ] `BigDecimal`
  - [ ] `java.math.BigDecimal`
  - [ ] `java.math.BigInteger`
  - [ ] `scala.math.BigInt`

String types

- [x] `StringType`
  - [x] `String`
  - [ ] `Enumeration#Value`
  - [ ] `java.lang.Enum[_]`

Binary type

- [ ] `BinaryType`
  - [ ] `Array[Byte]`

Boolean type

- [x] `BooleanType`
  - [x] `Boolean`
  - [x] `java.lang.Boolean`

Datetime type

- [x] `DateType`
  - [x] `java.sql.Date`
  - [x] `java.time.LocalDate`
- [x] `TimestampType`
  - [x] `java.sql.Timestamp`
  - [x] `java.time.Instant`
- [ ] `CalendarIntervalType`
  - [ ] `org.apache.spark.unsafe.types.CalendarInterval`

Interval type

- [ ] `DayTimeIntervalType`
  - [ ] `java.time.Duration`
- [ ] `YearMonthIntervalType`
  - [ ] `java.time.Period`

Array type

- [x] `ArrayType`
  - [x] `Array[_]`
  - [x] `Seq[_]`
  - [x] `Set[_]`

Map type

- [x] `MapType`
  - [x] `Map[_, _]`

Option types

- [x] Spark type for `T`
  - [x] `Option[T]`

Struct types

- [x] `StructType`
  - [x] `Product` (standard or user-defined case classes, in particular)
  - [x] `Row`

User-defined types

- [ ] Spark type
  - [ ] `SQLUserDefinedType`
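
One way to check an entry of the list against what Spark itself resolves is to call `ScalaReflection.schemaFor` directly. The sketch below assumes that `spark-catalyst` 3.2.x is on the classpath. `SchemaForCheck` and `resolved` are hypothetical helper names, and `ScalaReflection` is an internal Catalyst API whose behaviour may differ in other Spark versions.

```scala
import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.types._
import scala.reflect.runtime.universe.TypeTag

object SchemaForCheck {
  // Data type that Spark resolves reflectively for the Scala type T.
  // (`resolved` is a hypothetical helper; `schemaFor` also reports nullability.)
  def resolved[T: TypeTag]: DataType =
    ScalaReflection.schemaFor[T].dataType

  def main(args: Array[String]): Unit = {
    // Print the resolved data type for a few Scala types from the list.
    println(resolved[Int])
    println(resolved[java.lang.Long])
    println(resolved[Option[Int]])
    println(resolved[java.time.LocalDate])
    println(resolved[Array[Byte]])
    println(resolved[Seq[String]])
  }
}
```

Comparing these results with the data type that doric assigns to the same Scala type gives a simple regression check for each entry of the list.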