delta icon indicating copy to clipboard operation
delta copied to clipboard

[Kernel] Test `DefaultJsonHandler` for empty JSON files

Open allisonport-db opened this issue 2 years ago • 5 comments

Which Delta project/connector is this regarding?

  • [ ] Spark
  • [ ] Standalone
  • [ ] Flink
  • [X] Kernel
  • [ ] Other (fill in here)

Overview

#2211 adds a test for reading empty parquet files (and fixes a bug in our DefaultParquetReader). We should test the same for the DefaultJsonHandler.

Further details

We should test DefaultJsonHandler.readJsonFiles(...) with a file-list that includes empty JSON files in the middle of the list.

allisonport-db avatar Oct 30 '23 19:10 allisonport-db

Hi @allisonport-db,

Is this ticket still open? If so I would like to help on this. 😄

fernandocast avatar Nov 25 '23 07:11 fernandocast

@fernandocast are you still interested in working on this issue? Let me know if you have any questions.

vkorukanti avatar Feb 14 '24 22:02 vkorukanti

@fernandocast are you still interested in working on this issue? Let me know if you have any questions.

Yes, I'm still interested. Sorry, I have been kind of busy but I'm gonna start this week.

fernandocast avatar Mar 04 '24 06:03 fernandocast

@fernandocast are you still interested in working on this issue? Let me know if you have any questions.

One question @vkorukanti, I noticed that the method DefaultJsonHandler.readJsonFiles doesn't works well if the list of files contains and empty file in the middle. Is this the expected behavior ? or should I fix this and also perform the tests?

Example of current behaviour:

list of json files 1.json -> {non_empty_json} 1_empty.json -> empty 2.json -> {non_empty_json}

in this case the current code just read data from 1.json and stops when it found the empty file.

fernandocast avatar Mar 14 '24 05:03 fernandocast

@fernandocast Yep, it is a bug. You can fix it by calling the hasNext recursively within the current hasNext implementation. Please fix the bug along with the tests.

vkorukanti avatar Mar 16 '24 00:03 vkorukanti