[BUG] Schema is not merged when reading multiple files with different schemas
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
I am trying to read multiple Excel files inside a folder. When I print the schema, the dataframe only shows the schema of the first Excel file. With CSV, the mergeSchema option merges the schemas of all files. Is there an equivalent option for 'excel' to merge the schemas of multiple files?
This is how I am reading a folder.
Expected Behavior
The schemas of all files inside a folder should be merged.
Steps To Reproduce
Put two Excel files with different columns inside a folder and read the folder.
Environment
- Spark version: 3.3.0
- Spark-Excel version: com.crealytics:spark-excel_2.12:0.14.0
- OS: Windows
- Cluster environment: 11.3 LTS
Anything else?
No response
Does that combination of Spark and spark-excel even work? Please always try the newest version of spark-excel when posting issues. I don't think that solves the problem in this case, but it makes the life of the maintainers much easier when we know that an issue is present in the newest version.
I see the same issue with the versions below. inferSchema is true and there are 2 date partitions; one of them has an extra column, and that column does not show up in the schema. @nightscape
spark-excel_2.12_0.18.5
spark 3.2.2
Same results with
spark-excel_2.13_0.18.7
spark 3.3.1
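Until there is a merge option for 'excel', one possible workaround is to read each file on its own and union the results by column name. The sketch below is only that, not a spark-excel feature: `read_excel_files_merged`, `paths` and `reader_format` are hypothetical names, and the "excel" format name plus the "header" and "inferSchema" options are assumed to match the spark-excel version in use. It relies on `DataFrame.unionByName(..., allowMissingColumns=True)`, which needs Spark 3.1 or later.

```python
from functools import reduce


def read_excel_files_merged(spark, paths, reader_format="excel"):
    """Read each Excel file separately, then union the results by column name.

    `spark` is an existing SparkSession. `reader_format` and the reader
    options below are assumptions; adjust them for your spark-excel version.
    """
    if not paths:
        raise ValueError("paths must contain at least one file")

    # One DataFrame per file, so each file gets its own inferred schema.
    frames = [
        spark.read.format(reader_format)
        .option("header", "true")
        .option("inferSchema", "true")
        .load(path)
        for path in paths
    ]

    # allowMissingColumns=True fills columns that are absent on one side
    # with nulls instead of failing on the mismatch.
    return reduce(
        lambda left, right: left.unionByName(right, allowMissingColumns=True),
        frames,
    )
```

Usage would look like `read_excel_files_merged(spark, ["/path/a.xlsx", "/path/b.xlsx"]).printSchema()`, with the file list built by hand or from a directory listing. If the same column is inferred with different types in different files, the union may fail or coerce the type, so casting those columns first may be needed.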