`.option("mode", "FAILFAST")` is not working
Steps to Reproduce (for bugs)
Excel File to load:
Code to load file:
```scala
%scala
import org.apache.spark.sql._
import org.apache.spark.sql.types._

val myschema = StructType(Array(
  StructField("Processo", StringType, nullable = false),
  StructField("Data", TimestampType, nullable = false),
  StructField("Balcao", StringType, nullable = false),
  StructField("CriadoPor", StringType, nullable = false)
))

val df = spark.read
  .format("com.crealytics.spark.excel")
  .option("dataAddress", "teste!A1")
  .option("mode", "FAILFAST")
  .option("header", true)
  .schema(myschema)
  .load("/mnt/raw/externalfiles/TESTE.xlsx")

display(df)
```
The code should fail if the `Data` column contains a wrong/invalid datetime value, but it does not throw an exception even when an action such as `df.show()` is called. Can you confirm whether crealytics supports `.option("mode", <mode>)`?
Your Environment
Databricks Runtime Version ==> 10.1 (includes Apache Spark 3.2.0, Scala 2.12)
Libraries ==> com.crealytics:spark-excel_2.12:0.13.1
@girishjambkar the issue template is supposed to be filled out 😉
I have filled it in now, thanks! Can you please confirm whether crealytics supports `.option("mode", <mode>)` the way we can with CSV? For comparison, see the sketch below.
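For context, this is the behavior being asked about with Spark's built-in CSV source, as a minimal sketch (the file path here is hypothetical):

```scala
// With mode=FAILFAST, any row that cannot be parsed against the supplied
// schema makes Spark throw (wrapped in a SparkException) as soon as an
// action such as show() runs.
val csvDf = spark.read
  .option("header", true)
  .option("mode", "FAILFAST") // PERMISSIVE (default) | DROPMALFORMED | FAILFAST
  .schema(myschema)
  .csv("/mnt/raw/externalfiles/teste.csv") // hypothetical path

csvDf.show() // throws here if any row is malformed
```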
Version 1 of the data source (which you are using by specifying `.format("com.crealytics.spark.excel")`) does not support this. Try `.format("excel")` instead.
File "/usr/lib/spark/python/lib/py4j-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o142.load.
: java.lang.ClassNotFoundException: Failed to find data source: excel. Please find packages at http://spark.apache.org/third-party-projects.html
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: excel.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
at scala.util.Try.orElse(Try.scala:84)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:634)
... 13 more
@nightscape I am unable to use `.format("excel")`; I get the exception above.
You're using an outdated version, please upgrade.
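For example, as a dependency sketch rather than a definitive coordinate: newer spark-excel artifacts are versioned as `<sparkVersion>_<pluginVersion>`, and the Scala suffix must match the cluster runtime.

```scala
// sbt sketch: DBR 10.1 ships Scala 2.12 and Spark 3.2.0, so the matching
// artifact would be spark-excel_2.12 with a 3.2.0_* version.
libraryDependencies += "com.crealytics" % "spark-excel_2.12" % "3.2.0_0.16.0"
```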

I have tried running with `.format("excel")`. It is failing with an error (expanded trace below). @nightscape
Databricks Runtime Version ==> 10.1 (includes Apache Spark 3.2.0, Scala 2.12)
Libraries ==> com.crealytics:spark-excel_2.13:3.2.0_0.16.0
Please help.
Could you expand the exception and post it here?
Error trace:

```
NoClassDefFoundError: shadeio/commons/io/output/UnsynchronizedByteArrayOutputStream
Caused by: ClassNotFoundException: shadeio.commons.io.output.UnsynchronizedByteArrayOutputStream
  at shadeio.poi.poifs.filesystem.FileMagic.valueOf(FileMagic.java:209)
  at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:206)
  at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:172)
  at com.crealytics.spark.excel.DefaultWorkbookReader.$anonfun$openWorkbook$1(WorkbookReader.scala:55)
  at scala.Option.fold(Option.scala:251)
  at com.crealytics.spark.excel.DefaultWorkbookReader.openWorkbook(WorkbookReader.scala:55)
  at com.crealytics.spark.excel.WorkbookReader.withWorkbook(WorkbookReader.scala:16)
  at com.crealytics.spark.excel.WorkbookReader.withWorkbook$(WorkbookReader.scala:15)
  at com.crealytics.spark.excel.DefaultWorkbookReader.withWorkbook(WorkbookReader.scala:50)
  at com.crealytics.spark.excel.ExcelRelation.excerpt$lzycompute(ExcelRelation.scala:32)
  at com.crealytics.spark.excel.ExcelRelation.excerpt(ExcelRelation.scala:32)
  at com.crealytics.spark.excel.ExcelRelation.headerColumns$lzycompute(ExcelRelation.scala:104)
  at com.crealytics.spark.excel.ExcelRelation.headerColumns(ExcelRelation.scala:103)
  at com.crealytics.spark.excel.ExcelRelation.$anonfun$inferSchema$1(ExcelRelation.scala:172)
  at scala.Option.getOrElse(Option.scala:189)
  at com.crealytics.spark.excel.ExcelRelation.inferSchema(ExcelRelation.scala:171)
  at com.crealytics.spark.excel.ExcelRelation.<init>(ExcelRelation.scala:36)
  at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:36)
  at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:13)
  at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:8)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:385)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:390)
  at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:346)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:346)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:237)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2541019815824441:3)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-2541019815824441:47)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw$$iw.<init>(command-2541019815824441:49)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw.<init>(command-2541019815824441:51)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw.<init>(command-2541019815824441:53)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw.<init>(command-2541019815824441:55)
  at $line03e3e0503061413eab90de3bf6be643427.$read.<init>(command-2541019815824441:57)
  at $line03e3e0503061413eab90de3bf6be643427.$read$.<init>(command-2541019815824441:61)
  at $line03e3e0503061413eab90de3bf6be643427.$read$.<clinit>(command-2541019815824441)
  at $line03e3e0503061413eab90de3bf6be643427.$eval$.$print$lzycompute(<console>:7)
  at $line03e3e0503061413eab90de3bf6be643427.$eval$.$print(<console>:6)
  at $line03e3e0503061413eab90de3bf6be643427.$eval.$print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
  at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
  at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
  at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
  at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
  at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
  at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
  at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:219)
  at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:235)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:902)
  at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:855)
  at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:235)
  at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$13(DriverLocal.scala:541)
  at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
  at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
  at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
  at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
  at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:50)
  at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
  at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
  at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:50)
  at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:518)
  at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:689)
  at scala.util.Try$.apply(Try.scala:213)
  at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:681)
  at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:522)
  at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:634)
  at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:427)
  at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:370)
  at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:221)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: shadeio.commons.io.output.UnsynchronizedByteArrayOutputStream
  at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
  at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
  at shadeio.poi.poifs.filesystem.FileMagic.valueOf(FileMagic.java:209)
  at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:206)
  at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:172)
  at com.crealytics.spark.excel.DefaultWorkbookReader.$anonfun$openWorkbook$1(WorkbookReader.scala:55)
  at scala.Option.fold(Option.scala:251)
  at com.crealytics.spark.excel.DefaultWorkbookReader.openWorkbook(WorkbookReader.scala:55)
  at com.crealytics.spark.excel.WorkbookReader.withWorkbook(WorkbookReader.scala:16)
  at com.crealytics.spark.excel.WorkbookReader.withWorkbook$(WorkbookReader.scala:15)
  at com.crealytics.spark.excel.DefaultWorkbookReader.withWorkbook(WorkbookReader.scala:50)
  at com.crealytics.spark.excel.ExcelRelation.excerpt$lzycompute(ExcelRelation.scala:32)
  at com.crealytics.spark.excel.ExcelRelation.excerpt(ExcelRelation.scala:32)
  at com.crealytics.spark.excel.ExcelRelation.headerColumns$lzycompute(ExcelRelation.scala:104)
  at com.crealytics.spark.excel.ExcelRelation.headerColumns(ExcelRelation.scala:103)
  at com.crealytics.spark.excel.ExcelRelation.$anonfun$inferSchema$1(ExcelRelation.scala:172)
  at scala.Option.getOrElse(Option.scala:189)
  at com.crealytics.spark.excel.ExcelRelation.inferSchema(ExcelRelation.scala:171)
  at com.crealytics.spark.excel.ExcelRelation.<init>(ExcelRelation.scala:36)
  at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:36)
  at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:13)
  at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:8)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:385)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:390)
  at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:346)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:346)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:237)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2541019815824441:3)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-2541019815824441:47)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw$$iw.<init>(command-2541019815824441:49)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw$$iw.<init>(command-2541019815824441:51)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw$$iw.<init>(command-2541019815824441:53)
  at $line03e3e0503061413eab90de3bf6be643427.$read$$iw.<init>(command-2541019815824441:55)
  at $line03e3e0503061413eab90de3bf6be643427.$read.<init>(command-2541019815824441:57)
  at $line03e3e0503061413eab90de3bf6be643427.$read$.<init>(command-2541019815824441:61)
  at $line03e3e0503061413eab90de3bf6be643427.$read$.<clinit>(command-2541019815824441)
  at $line03e3e0503061413eab90de3bf6be643427.$eval$.$print$lzycompute(<console>:7)
  at $line03e3e0503061413eab90de3bf6be643427.$eval$.$print(<console>:6)
  at $line03e3e0503061413eab90de3bf6be643427.$eval.$print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
  at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
  at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
  at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
  at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
  at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
  at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
  at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:219)
  at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:235)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:902)
  at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:855)
  at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:235)
  at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$13(DriverLocal.scala:541)
  at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
  at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
  at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
  at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
  at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:50)
  at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
  at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
  at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:50)
  at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:518)
  at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:689)
  at scala.util.Try$.apply(Try.scala:213)
  at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:681)
  at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:522)
  at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:634)
  at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:427)
  at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:370)
  at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:221)
  at java.lang.Thread.run(Thread.java:748)
```
@nightscape I am getting the above exception in a Databricks notebook; I can try the same on an AWS EMR cluster in the meantime. Can you please confirm whether crealytics supports `.option("mode", <mode>)` in the upgraded version with `.format("excel")`?
The official documentation at https://github.com/crealytics/spark-excel doesn't mention anything like `.option("mode", <mode>)`.
It would also be great if the official documentation had some examples/samples for testing this.
I have seen similar problems while reading with a custom schema using crealytics:
https://stackoverflow.com/questions/70540400/spark-read-excel-not-reading-all-excel-rows-when-using-custom-schema/70565795#70565795
Hi @girishjambkar and @nightscape,
Let me check the support for `.option("mode", <mode>)` and get back to you. This looks like an issue in the V2 implementation.
Hi @quanghgx, I also get this issue running an Azure Databricks cluster on runtime 9.1 (Apache Spark 3.1.2) with com.crealytics:spark-excel_2.12:3.1.2_0.16.1-pre1.

`0.16.1-pre1` unfortunately is broken. I tried shading dependencies differently but made things worse. `0.16.1-pre2` would have been another shot at this, but the build is failing for a reason I don't understand yet. For now, please try `0.16.0`.