spark-redshift
DataFrame save gives exception for NullType
When I try to write a null column value to a Redshift table, DataFrame.save throws an exception: Unexpected type NullType.
http://stackoverflow.com/q/35966006/110449:
I am working with pyspark and saving a DataFrame to Redshift. I am getting the error below when trying to save it:
java.lang.UnsupportedOperationException: Unexpected type NullType. at com.databricks.spark.avro.SchemaConverters$.com$databricks$spark$avro$SchemaConverters$$convertFieldTypeToAvro(SchemaConverters.scala:283)
When I look at the source code for SchemaConverters: https://github.com/databricks/spark-avro/blob/master/src/main/scala/com/databricks/spark/avro/SchemaConverters.scala
I can see no support for NullType. I have columns in the DataFrame which are null.
What is the solution to this?
Any update on this issue?
The type NullType usually occurs as the type of null literals; if you have a column of some other type which happens to contain nulls then you'll have a nullable field of that type (e.g. a nullable IntType field). The problem with NullType is that we don't know which SQL type it should map to and, as a result, do not know which type to assign to the column in the CREATE TABLE statement.
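For illustration, here's a minimal sketch (not from the original thread) of how such a schema arises; the column name and the use of spark.range are assumptions, and it assumes a SparkSession named `spark`:

```scala
import org.apache.spark.sql.functions.lit

// A column built purely from null literals gives Spark nothing to infer a type from.
val df = spark.range(3).withColumn("maybe_value", lit(null))

df.printSchema()
// The "maybe_value" field is reported with NullType (printed as "null" or "void"
// depending on the Spark version), so there is no SQL type to put in CREATE TABLE.
```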
Given this limitation, I don't think that you'll be able to create a new table if your schema contains a field with NullType. However, I think that you probably should be able to append to an existing table.
Therefore, I think there are a few things we could fix here:
- Give a more informative error message when trying to create a new table (or completely overwrite an existing one) if the DataFrame's schema contains a field whose type is NullType.
- When appending to an existing table, use the existing table's schema to replace the NullType by the type retrieved from the existing table (see the sketch after this list).
- Open tickets against Spark in case we find any cases where NullType is being inappropriately inferred / used in schemas.
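A rough sketch of what the second item might look like; this is not spark-redshift's actual implementation, and the helper name and cast-by-column-name approach are assumptions:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{NullType, StructField, StructType}

// Hypothetical helper: for each NullType field in df, cast it to the type that the
// existing table already uses, leaving all other columns untouched.
def replaceNullTypes(df: DataFrame, existingTableSchema: StructType): DataFrame = {
  val columns = df.schema.fields.map {
    case StructField(name, NullType, _, _) =>
      df.col(name).cast(existingTableSchema(name).dataType).as(name)
    case StructField(name, _, _, _) =>
      df.col(name)
  }
  df.select(columns: _*)
}
```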
I'm trying to append to an existing table and I'm still getting this error.
@farshidz same here
@ibnipun10 If you cast the null column to a Spark SQL supported type, it solves the issue.
Example:
lit(null).cast(DoubleType) in Scala
and
lit(None).cast(DoubleType()) in Python
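To put the workaround in context, here's a hedged end-to-end sketch in Scala; the DataFrame, column name, and connection settings are placeholders, and the write options follow spark-redshift's url / dbtable / tempdir pattern:

```scala
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.DoubleType

// Cast the all-null column to a concrete type so the schema no longer contains NullType.
val fixed = df.withColumn("null_col", lit(null).cast(DoubleType))

fixed.write
  .format("com.databricks.spark.redshift")
  .option("url", "jdbc:redshift://host:5439/db?user=...&password=...") // placeholder
  .option("dbtable", "my_table")                                       // placeholder
  .option("tempdir", "s3n://my-bucket/tmp")                            // placeholder
  .mode("append")
  .save()
```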