spark-excel
Data is changed when a column has multiple datatypes
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
I am trying to read data from a column that contains multiple datatypes, both integers and decimals. When I pass inferSchema as false, the decimal values are rounded to fewer digits (e.g. 284.259235897532 becomes 284.2592359). When I pass inferSchema as true, the integers are converted into decimals.
df: DataFrame = (
    self.spark.read.format("com.crealytics.spark.excel")
    .option("dataAddress", data_address)
    .option("header", "true")
    .option("inferSchema", "false")  #! important
    .option("usePlainNumberFormat", "false")  #! important
    .option("maxRowsInMemory", "50")
    .load(f"s3a://{self.bucket}/{self.excel_file}")
)
Expected Behavior
When data is read with spark-excel, the values should not change, even when a column contains multiple datatypes.
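To make the expectation concrete, here is a minimal sketch in plain Python of the rendering I would expect for a mixed column. The helper name `render_number` is hypothetical and is not part of spark-excel. It assumes each numeric cell arrives as a double: whole values are printed without a fractional part, and everything else keeps every digit needed to round-trip the double.

```python
def render_number(value: float) -> str:
    """Render a double as text without changing its value.

    Hypothetical helper for illustration only, not a spark-excel API.
    """
    # Whole numbers below 2**53 are exactly representable as doubles,
    # so they can safely be printed as integers ("12", not "12.0").
    if value.is_integer() and abs(value) < 2 ** 53:
        return str(int(value))
    # repr() gives the shortest string that parses back to the same double.
    return repr(value)


print(render_number(12.0))
print(render_number(284.259235897532))
```

A function like this could be applied after loading with inferSchema set to true, but it only helps if the full-precision double actually reaches the DataFrame.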
Steps To Reproduce
No response
Environment
- Spark version: 2.4.7
- Spark-Excel version: 2.11
- OS: Ubuntu
- Cluster environment
Anything else?
No response
We're using the getNumericCellValue method to read data from a cell using POI. Unfortunately, I'm not aware of any other way to read numeric values into a higher-precision format...
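For what it's worth, a double can hold the reported value in full, so the loss may happen when the number is formatted as text rather than when it is read. The sketch below uses plain Python (whose float is the same IEEE 754 double that getNumericCellValue returns) to show that the rounded value from the report matches a 10-significant-digit rendering. That the formatting step is the cause is an assumption, not something confirmed in this thread.

```python
# The value from the report, stored as an IEEE 754 double.
value = 284.259235897532

# Shortest text that parses back to exactly the same double.
full = repr(value)

# The same double rendered with only 10 significant digits.
# Assumption: this mimics a "General"-style display format.
ten_digits = format(value, ".10g")

print(full)        # 284.259235897532
print(ten_digits)  # 284.2592359
```

If that assumption holds, the place to look would be the code path that turns the cell into a string when inferSchema is false, not getNumericCellValue itself.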