JSON-java
JSON-java copied to clipboard
JSON CDL is omitting last row when parsing CSV with an empty last column
I was playing with parsing CSV files into JSON using CDL library from org.json.CDL and I observed a strange behavior.
When I pass a CSV with an empty last value to the toJSONArray method, it's omitting the last row.
package com.my.test;
import org.json.CDL;
import org.json.JSONException;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.junit.jupiter.MockitoExtension;
import java.util.List;
import static org.junit.jupiter.api.Assertions.*;
@ExtendWith(MockitoExtension.class)
class CDLTest {
@Test
void testCDL_empty_column_fail() throws JSONException {
final String SIMPLE_CSV_EMPTY_COLUMN_STRING = "Number,Name,Empty\n" +
"1,Allice,\n" +
"2,Chris,\n" +
"3,Bob,";
List<Object> jsonObjects = CDL.toJSONArray(SIMPLE_CSV_EMPTY_COLUMN_STRING).toList();
assertAll(
() -> assertNotNull(jsonObjects),
() -> assertEquals(3, jsonObjects.size())
);
}
}
It's failing as well for that String:
final String SIMPLE_CSV_EMPTY_COLUMN_STRING = "Number,Name,Empty\n" +
"1,Allice,not\n" +
"2,Chris,empty\n" +
"3,Bob,";
Do you have any idea how can I fix it? For learning purposes, I'd prefer to stick to this library and just fix it codewise. I was thinking about extending CDL and reimplementing method with a bug. Do you have any other idea, or maybe someone already fixed it?
Versions:
<org.json.version>20190722</org.json.version> and
Java 11
is it an empty "last value" or is it the missing new line \n causing the issue?
For example, does the following data work:
final String SIMPLE_CSV_EMPTY_COLUMN_STRING = "Number,Name,Empty\n" +
"1,Allice,\n" +
"2,Chris,\n" +
"3,Bob,\n";
Hi @johnjaylward, When I add a new line CDL parses correctly, but CSV not necessarily needs to end with a new line, doesn't it?
The reason for this behavior is line 108 in CDL rowToJSONArray which assumes that the row needs something at the end:
if (value == null ||
(ja.length() == 0 && value.length() == 0 && c != ',')) {
I'm not sure about CSV specifically, but here is a good answer for text files in general: https://unix.stackexchange.com/a/18789
looks like CSV itself is lose on the ending new-line: https://www.csvreader.com/csv_format.php
Feel free to submit a PR for a fix
Feel free to submit a PR for a fix
The JavaDoc for the CDL class does specify that every row ends with a NEWLINE. So if you do make a change, please be sure to update any relevant JavaDoc as well.
I found as well that the test emptyLinesToJSONArray doesn't check anything because last row in str is omitted so it ends up being a case of a CSV with only header row = empty JSON:
/**
* Create a JSONArray from string containing only whitespace and commas
*/
@Test
public void emptyLinesToJSONArray() {
String str = " , , , \n , , , ";
JSONArray jsonArray = CDL.toJSONArray(str);
assertNull("JSONArray should be null for no content",
jsonArray);
}
The correct test should have at least 3 rows and yes it's failing so this functionality was never implemented.
I think I need some access rights to create a branch:
ERROR: Permission to stleary/JSON-java.git denied to jjanczur. Could not read from remote repository. Please make sure you have the correct access rights and the repository exists
Quick fix was to add additional check if the ja is not nul:
if ((value == null && ja.length() == 0) ||
(ja.length() == 0 &&
value != null &&
value.length() == 0 &&
c != ',')) {
instead of:
if (value == null ||
(ja.length() == 0 && value.length() == 0 && c != ',')) {
I found as well that the test emptyLinesToJSONArray doesn't check anything because last row in str is omitted so it ends up being a case of a CSV with only header row = empty JSON
This is actually validating the case that you initially brought up in this ticket. That a "row" with no \n at the end doesn't get processed. The test is currently correct as expected behaviour of the CDL class.
I think I need some access rights to create a branch:
ERROR: Permission to stleary/JSON-java.git denied to jjanczur. Could not read from remote repository. Please make sure you have the correct access rights and the repository exists
The standard model in GitHub is to create a Fork
of a project, make changes in your own branch(es), then submit a PR from your fork.
Closing due to lack of activity. If you think it should remain open, please post here.