frameless
CodeGen fails when case class fields are reserved Java keywords
It is possible to define a case class whose field names are reserved words by using backticks.
case class Foo(a: String, `if`: Int)
val t = TypedDataset.create(Seq(Foo("a",2), Foo("b",2)))
Fails with the following error:
17/06/01 00:45:54 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 49, Column 44: Unexpected selector 'if' after "."
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */ return new SpecificUnsafeProjection(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
/* 006 */
/* 007 */ private Object[] references;
/* 008 */ private UnsafeRow result;
/* 009 */ private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder holder;
/* 010 */ private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter rowWriter;
/* 011 */
/* 012 */
/* 013 */ public SpecificUnsafeProjection(Object[] references) {
/* 014 */ this.references = references;
/* 015 */ result = new UnsafeRow(2);
/* 016 */ this.holder = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(result, 32);
/* 017 */ this.rowWriter = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(holder, 2);
/* 018 */ }
/* 019 */
/* 020 */ // Scala.Function1 need this
/* 021 */ public java.lang.Object apply(java.lang.Object row) {
/* 022 */ return apply((InternalRow) row);
/* 023 */ }
/* 024 */
/* 025 */ public UnsafeRow apply(InternalRow i) {
/* 026 */ holder.reset();
/* 027 */
/* 028 */ rowWriter.zeroOutNullBytes();
/* 029 */
/* 030 */
/* 031 */ $line48.$read$$iw$$iw$$iw$$iw$$iw$$iw$Foo value2 = ($line48.$read$$iw$$iw$$iw$$iw$$iw$$iw$Foo)i.get(0, null);
/* 032 */
/* 033 */ boolean isNull1 = false;
/* 034 */ final java.lang.String value1 = isNull1 ? null : (java.lang.String) value2.a();
/* 035 */ isNull1 = value1 == null;
/* 036 */ boolean isNull = isNull1;
/* 037 */ final UTF8String value = isNull ? null : org.apache.spark.unsafe.types.UTF8String.fromString(value1);
/* 038 */ isNull = value == null;
/* 039 */ if (isNull) {
/* 040 */ rowWriter.setNullAt(0);
/* 041 */ } else {
/* 042 */ rowWriter.write(0, value);
/* 043 */ }
/* 044 */
/* 045 */
/* 046 */ $line48.$read$$iw$$iw$$iw$$iw$$iw$$iw$Foo value4 = ($line48.$read$$iw$$iw$$iw$$iw$$iw$$iw$Foo)i.get(0, null);
/* 047 */
/* 048 */ boolean isNull3 = false;
/* 049 */ final int value3 = isNull3 ? -1 : value4.if();
/* 050 */ if (isNull3) {
/* 051 */ rowWriter.setNullAt(1);
/* 052 */ } else {
/* 053 */ rowWriter.write(1, value3);
/* 054 */ }
/* 055 */ result.setTotalSize(holder.totalSize());
/* 056 */ return result;
/* 057 */ }
/* 058 */ }
How does Spark escape that?
Good question. It seems that Spark handles this at runtime:
java.lang.UnsupportedOperationException: `if` is a reserved keyword and cannot be used as field name
- root class: "Foo"
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:585)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:583)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:355)
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:583)
at org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:425)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:61)
at org.apache.spark.sql.Encoders$.product(Encoders.scala:274)
at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:47)
... 42 elided
@OlivierBlanvillain, maybe we can have a type class that uses shapeless to check whether the case class contains any reserved keywords and fail at compile time?
Totally doable, but I would put that very low in the priority list. I'm not even sure the added safety would be worth the extra compilation time...
I was writing an issue when I saw this one. I also encountered this with a word that needs no backticks in Scala but is still a Java keyword: char
case class SomeCaseClass(char: String)
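For illustration, here is a minimal sketch of the kind of check discussed above. It is not the shapeless type class proposed in the thread and not code from frameless or Spark: it is a plain runtime check using Java reflection, and the names ReservedFieldCheck, javaKeywords, reservedFields, and Fine are hypothetical. It assumes the case class fields are compiled to JVM fields with the same names, which holds for simple names such as `if` and char.

```scala
object ReservedFieldCheck {
  // Java reserved keywords plus the literals true/false/null. None of these
  // can be used as a method name in generated Java source such as value4.if().
  val javaKeywords: Set[String] = Set(
    "abstract", "assert", "boolean", "break", "byte", "case", "catch", "char",
    "class", "const", "continue", "default", "do", "double", "else", "enum",
    "extends", "final", "finally", "float", "for", "goto", "if", "implements",
    "import", "instanceof", "int", "interface", "long", "native", "new",
    "package", "private", "protected", "public", "return", "short", "static",
    "strictfp", "super", "switch", "synchronized", "this", "throw", "throws",
    "transient", "try", "void", "volatile", "while", "true", "false", "null"
  )

  // Returns the declared field names of `cls` that collide with Java keywords.
  // This is a runtime check (hypothetical helper), not a compile-time one.
  def reservedFields(cls: Class[_]): Seq[String] =
    cls.getDeclaredFields.toSeq.map(_.getName).filter(javaKeywords.contains)
}

// The two case classes reported in this issue, plus a harmless one.
case class Foo(a: String, `if`: Int)
case class SomeCaseClass(char: String)
case class Fine(a: String, b: Int)
```

Calling ReservedFieldCheck.reservedFields(classOf[Foo]) before creating a TypedDataset would flag the `if` field early instead of letting code generation fail. A compile-time version would need shapeless or a macro, as suggested above.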