spark-excel

Data is getting changed when a column has multiple datatypes

Open shanmukha-albanero opened this issue 1 year ago • 1 comment

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

I am trying to read a column that contains multiple data types, both integers and decimals. With inferSchema set to false, the decimal values are rounded to fewer digits (e.g. 284.259235897532 becomes 284.2592359). With inferSchema set to true, the integers are converted to decimals.

    df: DataFrame = (
        self.spark.read.format("com.crealytics.spark.excel")
        .option("dataAddress", data_address)
        .option("header", "true")
        .option("inferSchema", "false")  #! important
        .option("usePlainNumberFormat", "false")  #! important
        .option("maxRowsInMemory", "50")
        .load(f"s3a://{self.bucket}/{self.excel_file}")
    )

Expected Behavior

When data is read with spark-excel, the values should not change, even when a column contains multiple data types.

Steps To Reproduce

No response

Environment

- Spark version: 2.4.7
- Spark-Excel version: 2.11
- OS: ubuntu
- Cluster environment

Anything else?

No response

shanmukha-albanero Mar 16 '23 09:03

We're using the getNumericCellValue method to read data from a cell using POI. Unfortunately, I'm not aware of any other way to read numeric values into a higher-precision format...

nightscape Mar 16 '23 23:03
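For anyone reading along, here is a minimal sketch in plain Python (no Spark or POI involved) of the arithmetic behind the reported value. A Java double, which is what getNumericCellValue returns, is an IEEE 754 binary64, the same representation as a Python float. The sketch shows that this representation holds 284.259235897532 without visible loss, and that keeping only 10 significant digits reproduces the 284.2592359 from the report. The helper name `general_like` and the choice of 10 digits are assumptions made for illustration; the thread does not confirm at which stage spark-excel or POI actually does the rounding.

```python
import struct

# Value taken from the issue report.
original_text = "284.259235897532"

# Parse into an IEEE 754 binary64, the same representation as a Java double.
as_double = float(original_text)

# Round-trip through the raw 8-byte encoding to show storage loses nothing.
round_tripped = struct.unpack(">d", struct.pack(">d", as_double))[0]


def general_like(x: float, significant_digits: int = 10) -> str:
    """Hypothetical stand-in for a display format keeping N significant digits.

    This is NOT the spark-excel or POI implementation, only an illustration.
    """
    return format(x, f".{significant_digits}g")


# The double itself still carries all 15 digits from the spreadsheet.
print(repr(round_tripped))      # 284.259235897532

# Formatting to 10 significant digits yields the value seen in the report.
print(general_like(as_double))  # 284.2592359
```

If that assumption holds, the digits are lost when the number is turned into a string rather than when it is read as a double, so the formatting options (such as usePlainNumberFormat from the snippet above) may be the place to look.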