[Kernel] Test `DefaultJsonHandler` for empty JSON files
Which Delta project/connector is this regarding?
- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [X] Kernel
- [ ] Other (fill in here)
Overview
#2211 adds a test for reading empty parquet files (and fixes a bug in our DefaultParquetReader). We should test the same for the DefaultJsonHandler.
Further details
We should test DefaultJsonHandler.readJsonFiles(...) with a file-list that includes empty JSON files in the middle of the list.
Hi @allisonport-db,
Is this ticket still open? If so I would like to help on this. 😄
@fernandocast are you still interested in working on this issue? Let me know if you have any questions.
@fernandocast are you still interested in working on this issue? Let me know if you have any questions.
Yes, I'm still interested. Sorry, I have been kind of busy but I'm gonna start this week.
@fernandocast are you still interested in working on this issue? Let me know if you have any questions.
One question @vkorukanti, I noticed that the method DefaultJsonHandler.readJsonFiles doesn't works well if the list of files contains and empty file in the middle. Is this the expected behavior ? or should I fix this and also perform the tests?
Example of current behaviour:
list of json files 1.json -> {non_empty_json} 1_empty.json -> empty 2.json -> {non_empty_json}
in this case the current code just read data from 1.json and stops when it found the empty file.
@fernandocast Yep, it is a bug. You can fix it by calling the hasNext recursively within the current hasNext implementation. Please fix the bug along with the tests.