Coverage of Spark types

Open jserranohidalgo opened this issue 2 years ago • 0 comments

This issue tracks coverage of Spark types by doric. For each Spark type, doric must allow us to:

  • Create a doric column of the corresponding Scala type
  • Collect Scala values from fields of Rows
  • Create doric columns of the corresponding Scala type from literal values

The underlying Spark data type assigned by doric for a given Scala type T should be the data type resolved by Spark's schemaFor[T]. doric will determine this data type statically (through implicits), unlike the reflective approach followed by Spark.
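
To illustrate what "statically, through implicits" could look like, here is a minimal, dependency-free sketch. Everything in it is hypothetical: the `SparkTypeOf` type class and the small `DataType` ADT are stand-ins written for this example, not doric's or Spark's actual definitions. The nullability flags follow what Spark's schemaFor reports for primitives, boxed types and Option.

```scala
// Hypothetical stand-ins for a few of Spark's data types (not the real Spark classes).
sealed trait DataType
case object IntegerType extends DataType
case object LongType extends DataType
case object StringType extends DataType
final case class ArrayType(element: DataType) extends DataType
final case class MapType(key: DataType, value: DataType) extends DataType

// Hypothetical type class: compile-time evidence that Scala type T has a Spark data type.
trait SparkTypeOf[T] {
  def dataType: DataType
  def nullable: Boolean
}

object SparkTypeOf {
  // Summoner: resolves the instance at compile time, with no runtime reflection.
  def apply[T](implicit ev: SparkTypeOf[T]): SparkTypeOf[T] = ev

  private def instance[T](dt: DataType, isNullable: Boolean): SparkTypeOf[T] =
    new SparkTypeOf[T] {
      def dataType: DataType = dt
      def nullable: Boolean = isNullable
    }

  // Primitive and boxed types share a data type but differ in nullability.
  implicit val intType: SparkTypeOf[Int] = instance[Int](IntegerType, false)
  implicit val boxedIntType: SparkTypeOf[java.lang.Integer] =
    instance[java.lang.Integer](IntegerType, true)
  implicit val longType: SparkTypeOf[Long] = instance[Long](LongType, false)
  implicit val stringType: SparkTypeOf[String] = instance[String](StringType, true)

  // Option[T] keeps the data type of T and only makes it nullable.
  implicit def optionType[T](implicit ev: SparkTypeOf[T]): SparkTypeOf[Option[T]] =
    instance[Option[T]](ev.dataType, true)

  // Collections are derived from the evidence for their element types.
  implicit def seqType[T](implicit ev: SparkTypeOf[T]): SparkTypeOf[Seq[T]] =
    instance[Seq[T]](ArrayType(ev.dataType), true)

  implicit def mapType[K, V](implicit
      k: SparkTypeOf[K],
      v: SparkTypeOf[V]
  ): SparkTypeOf[Map[K, V]] =
    instance[Map[K, V]](MapType(k.dataType, v.dataType), true)
}

object Demo extends App {
  println(SparkTypeOf[Option[Int]].dataType)            // IntegerType
  println(SparkTypeOf[Int].nullable)                    // false
  println(SparkTypeOf[Map[String, Seq[Long]]].dataType) // MapType(StringType,ArrayType(LongType))
  // SparkTypeOf[java.util.UUID] would not compile: no implicit instance is defined for it.
}
```

With this design, an unsupported Scala type is rejected at compile time rather than failing at runtime, so each unchecked box in the list below would correspond to a missing instance.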

List of types (as of Spark 3.2.1):

Null type

  • NullType
    • [x] Null

Numeric types

  • IntegerType

    • [x] Int
    • [x] java.lang.Integer
  • LongType

    • [x] Long
    • [x] java.lang.Long
  • FloatType

    • [x] Float
    • [x] java.lang.Float
  • DoubleType

    • [x] Double
    • [x] java.lang.Double
  • ShortType

    • [x] Short
    • [x] java.lang.Short
  • ByteType

    • [x] Byte
    • [x] java.lang.Byte
  • DecimalType

    • [ ] Decimal
    • [ ] BigDecimal
    • [ ] java.math.BigDecimal
    • [ ] java.math.BigInteger
    • [ ] scala.math.BigInt

String types

  • StringType
    • [x] String
    • [ ] Enumeration#Value
    • [ ] java.lang.Enum[_]

Binary type

  • BinaryType
    • [ ] Array[Byte]

Boolean type

  • BooleanType
    • [x] Boolean
    • [x] java.lang.Boolean

Datetime type

  • DateType

    • [x] java.sql.Date
    • [x] java.time.LocalDate
  • TimestampType

    • [x] java.sql.Timestamp
    • [x] java.time.Instant
  • CalendarIntervalType

    • [ ] org.apache.spark.unsafe.types.CalendarInterval

Interval type

  • DayTimeIntervalType

    • [ ] java.time.Duration
  • YearMonthIntervalType

    • [ ] java.time.Period

Array type

  • ArrayType
    • [x] Array[_]
    • [x] Seq[_]
    • [x] Set[_]

Map type

  • MapType
    • [x] Map[_, _]

Option types

  • Spark type for T
    • [x] Option[T]

Struct types

  • StructType
    • [x] Product (standard or user-defined case classes, in particular)
    • [x] Row

User-defined types

  • Spark type
    • [ ] SQLUserDefinedType

jserranohidalgo, Jun 10 '22 09:06