Logs with Warning for Currency format

Open EnverOsmanov opened this issue 2 years ago • 3 comments

TLDR: Let's release a new version of spark-excel. It uses POI 5.


I have a file in which some cells use the Currency format. spark-excel can read it, but it logs warnings while doing so.

Expected Behavior

A valid file should be read without warnings.

Current Behavior

The following warning is logged:

21/11/16 15:40:57 WARN CellFormatter: Invalid format: "_([$€-2]\ * #,##0.00_);"
java.lang.IllegalArgumentException: Unsupported [] format block '[' in '_([$€-2]\ * #,##0.00_)' with c2: null
	at shadeio.poi.ss.format.CellFormatPart.formatType(CellFormatPart.java:373)
	at shadeio.poi.ss.format.CellFormatPart.getCellFormatType(CellFormatPart.java:287)
	at shadeio.poi.ss.format.CellFormatPart.<init>(CellFormatPart.java:191)
	at shadeio.poi.ss.format.CellFormat.<init>(CellFormat.java:189)
	at shadeio.poi.ss.format.CellFormat.getInstance(CellFormat.java:163)
	at shadeio.poi.ss.usermodel.DataFormatter.getFormat(DataFormatter.java:343)
	at shadeio.poi.ss.usermodel.DataFormatter.getFormat(DataFormatter.java:309)
	at shadeio.poi.ss.usermodel.DataFormatter.getFormattedNumberString(DataFormatter.java:868)
	at shadeio.poi.ss.usermodel.DataFormatter.formatCellValue(DataFormatter.java:1021)
	at shadeio.poi.ss.usermodel.DataFormatter.formatCellValue(DataFormatter.java:971)
	at shadeio.poi.ss.usermodel.DataFormatter.formatCellValue(DataFormatter.java:950)
	at com.crealytics.spark.excel.HeaderDataColumn.stringValue$lzycompute$1(DataColumn.scala:58)
	at com.crealytics.spark.excel.HeaderDataColumn.stringValue$1(DataColumn.scala:48)
	at com.crealytics.spark.excel.HeaderDataColumn.extractValue(DataColumn.scala:105)
	at com.crealytics.spark.excel.DataColumn.apply(DataColumn.scala:17)

Possible Solution

I tested with the latest (unreleased) spark-excel code and got no warnings, presumably because it uses POI 5.

Context

LibreOffice shows the following format code for the Currency format: "([$€-2] * #,##0.00);([$€-2] * (#,##0.00);([$€-2] * "-"??);(@_)"

One more interesting detail: the original value in the file is "172000000". When I read the file with spark-excel 0.14.0, the cell value is "172000000". When I read it with spark-excel from the main branch, the cell value is "€ 172,000,000.00".

Your Environment

spark-excel 0.14.0

EnverOsmanov avatar Nov 16 '21 14:11 EnverOsmanov

The 0.15.0 release is running: https://github.com/crealytics/spark-excel/releases/tag/v0.15.0

nightscape avatar Nov 16 '21 14:11 nightscape

I tested the new version today. Here are the issues I found, along with a few notes:

  1. The latest version (3.1.2_0.15.0) doesn't shade Apache POI, probably because spark-excel depends on POI 5.0.0 while other dependencies such as excel-streaming-reader and spoiwo depend on POI 5.1.0. I upgraded the POI dependency in spark-excel to 5.1.0 and published it locally; Apache POI was then successfully shaded into the final jar.
sbt assembly // or
sbt publishM2
  2. Now I have another issue: spark-excel fails to find workbook providers:
Your InputStream was neither an OLE2 stream, nor an OOXML stream or you haven't provide the poi-ooxml*.jar in the classpath/modulepath - FileMagic: OOXML, having providers: []

java.io.IOException: Your InputStream was neither an OLE2 stream, nor an OOXML stream or you haven't provide the poi-ooxml*.jar in the classpath/modulepath - FileMagic: OOXML, having providers: []
	at shadeio.poi.ss.usermodel.WorkbookFactory.wp(WorkbookFactory.java:309)
	at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:208)
	at shadeio.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:172)
	at com.crealytics.spark.excel.DefaultWorkbookReader.$anonfun$openWorkbook$1(WorkbookReader.scala:49)
	at scala.Option.fold(Option.scala:251)
	at com.crealytics.spark.excel.DefaultWorkbookReader.openWorkbook(WorkbookReader.scala:49)

I think this is because POI 5 uses ServiceLoader to find WorkbookProviders. The providers are listed in the shaded jar under "META-INF/services":

org.apache.poi.xslf.usermodel.XSLFSlideShowFactory
..
org.apache.poi.xssf.usermodel.XSSFWorkbookFactory
org.apache.poi.hssf.usermodel.HSSFWorkbookFactory

They probably need to be shaded as well. I will keep looking into this.
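
For anyone who wants to check this on their own jar: ServiceLoader finds providers through a resource named "META-INF/services/<fully qualified service type>", where each line names one provider class. If the classes were relocated to shadeio.poi but a service file kept its org.apache.poi name or contents, the lookup for the relocated type would come back empty. Below is a minimal sketch of a diagnostic that lists such leftovers. ServiceEntryCheck and findUnshaded are hypothetical names made up for this note (they are not part of spark-excel or POI), and the sketch only reports what it finds; it does not fix the jar.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical diagnostic helper; not part of spark-excel or Apache POI. */
final class ServiceEntryCheck {
    private static final String SERVICES_DIR = "META-INF/services/";

    /**
     * Returns the service files in the jar whose file name, or any provider
     * line inside them, still starts with the original (unshaded) prefix.
     */
    static List<String> findUnshaded(File jar, String originalPrefix) throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jf = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(SERVICES_DIR)) {
                    continue;
                }
                // The file name is the service type that ServiceLoader looks up.
                String serviceType = name.substring(SERVICES_DIR.length());
                if (serviceType.startsWith(originalPrefix)) {
                    hits.add(name);
                }
                // Each non-comment line names one provider implementation.
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(jf.getInputStream(entry), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        int hash = line.indexOf('#'); // '#' starts a comment
                        String provider = (hash >= 0 ? line.substring(0, hash) : line).trim();
                        if (provider.startsWith(originalPrefix)) {
                            hits.add(name + " -> " + provider);
                        }
                    }
                }
            }
        }
        return hits;
    }
}
```

Calling ServiceEntryCheck.findUnshaded(new File(pathToAssemblyJar), "org.apache.poi.") on the assembled jar should return an empty list if the service files were relocated together with the classes.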

EnverOsmanov avatar Nov 17 '21 15:11 EnverOsmanov

(I will leave my note here; maybe it will be useful.)

I tried 3.1.2_0.15.1 (#465), but my program still failed.

java.lang.NoClassDefFoundError: shadeio/poi/schemas/vmldrawing/XmlDocument
	at shadeio.poi.xssf.usermodel.XSSFVMLDrawing.read(XSSFVMLDrawing.java:132)
	at shadeio.poi.xssf.usermodel.XSSFVMLDrawing.<init>(XSSFVMLDrawing.java:121)
	at shadeio.poi.ooxml.POIXMLFactory.createDocumentPart(POIXMLFactory.java:61)
	at shadeio.poi.ooxml.POIXMLDocumentPart.read(POIXMLDocumentPart.java:661)
	at shadeio.poi.ooxml.POIXMLDocumentPart.read(POIXMLDocumentPart.java:678)
	at shadeio.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:165)
	at shadeio.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:275)
	at shadeio.poi.xssf.usermodel.XSSFWorkbookFactory.createWorkbook(XSSFWorkbookFactory.java:118)

I found that class in poi-ooxml-lite, so I added it as a dependency.

After that I got a new error:

java.lang.ClassCastException: org.apache.poi.schemas.vmldrawing.impl.XmlDocumentImpl cannot be cast to shadeio.poi.schemas.vmldrawing.XmlDocument
	at shadeio.poi.xssf.usermodel.XSSFVMLDrawing.read(XSSFVMLDrawing.java:144)
	at shadeio.poi.xssf.usermodel.XSSFVMLDrawing.<init>(XSSFVMLDrawing.java:121)
	at shadeio.poi.ooxml.POIXMLFactory.createDocumentPart(POIXMLFactory.java:61)
	at shadeio.poi.ooxml.POIXMLDocumentPart.read(POIXMLDocumentPart.java:661)
	at shadeio.poi.ooxml.POIXMLDocumentPart.read(POIXMLDocumentPart.java:678)
	at shadeio.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:165)
	at shadeio.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:275)
	at shadeio.poi.xssf.usermodel.XSSFWorkbookFactory.createWorkbook(XSSFWorkbookFactory.java:118)

I learned that XmlDocumentImpl is created via the xml2eb5doctype.xsb file, which contains the full unshaded path to XmlDocumentImpl.
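
In case it helps others reproduce this, here is a rough sketch of how one could look for such leftovers in a shaded jar: it scans every entry with a given suffix (for example ".xsb") for a raw occurrence of an unshaded name. It assumes the class name is stored in the file as plain ASCII text, which I have not checked against the XMLBeans file format, so an empty result does not prove the jar is clean. UnshadedNameScan and entriesContaining are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical diagnostic helper; not part of spark-excel, POI, or XMLBeans. */
final class UnshadedNameScan {

    /** Returns the names of jar entries ending in entrySuffix whose bytes contain needle. */
    static List<String> entriesContaining(File jar, String entrySuffix, String needle)
            throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jf = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(entrySuffix)) {
                    continue;
                }
                try (InputStream in = jf.getInputStream(entry)) {
                    // ISO-8859-1 maps each byte to exactly one char, so binary
                    // content survives the conversion and ASCII text stays searchable.
                    String content = new String(readAll(in), StandardCharsets.ISO_8859_1);
                    if (content.contains(needle)) {
                        hits.add(entry.getName());
                    }
                }
            }
        }
        return hits;
    }

    /** Reads the whole stream into memory; fine for small resource files. */
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }
}
```

For example, UnshadedNameScan.entriesContaining(new File(pathToAssemblyJar), ".xsb", "org.apache.poi") would list the .xsb entries that still mention the original package in dotted form; searching for "org/apache/poi" as well covers the slash-separated form.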

EnverOsmanov avatar Dec 01 '21 23:12 EnverOsmanov