ValidationException Not Thrown and Error Messages Not User-Friendly
Issue with Validation in frictionlessdata/datapackage-java
Context
I am using Java 21 and frictionlessdata/datapackage-java to validate a CSV file against a schema defined in a datapackage.json. Here is the setup:
CSV File
firstname,lastname,gender,age
John,Doe,male,30
Jane,Smith,female,25
Alice,Johnson,female,19
Bob,Williams,male,17
### datapackage.json (person.csv)
{
  "name": "csv-validation-using-ig",
  "description": "Validates Person",
  "dialect": { "delimiter": "," },
  "resources": [
    {
      "name": "person_data",
      "path": "org/csv/person.csv",
      "schema": {
        "fields": [
          {
            "name": "firstname",
            "type": "string",
            "description": "The first name of the person.",
            "constraints": { "required": true }
          },
          {
            "name": "lastname",
            "type": "string",
            "description": "The last name of the person.",
            "constraints": { "required": true }
          },
          {
            "name": "gender",
            "type": "string",
            "description": "Gender of the person. Valid values are 'male' or 'female'.",
            "constraints": { "enum": ["male", "female"] }
          },
          {
            "name": "age",
            "type": "integer",
            "description": "The age of the person. Must be greater than 18.",
            "constraints": { "minimum": 19 }
          }
        ]
      }
    }
  ]
}
### JUnit Test
---
package csv;

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.junit.jupiter.api.Test;

import com.fasterxml.jackson.databind.ObjectMapper;

import io.frictionlessdata.datapackage.Package;
import io.frictionlessdata.tableschema.exception.ValidationException;

class PersonDataPackageValidationTest {

    @Test
    void validateDataPackage() throws Exception {
        // Validate the datapackage.json using the new resource paths
        ValidationException exception = assertThrows(ValidationException.class, () -> this.getDataPackageFromFilePath(
                "org/csv/datapackage.json", true));
        // Assert the validation messages
        assertNotNull(exception.getMessages());
        assertFalse(exception.getMessages().isEmpty());
    }

    public static Path getBasePath() {
        try {
            String pathName = "/src/test/resources/org/csv/datapackage.json";
            Path sourceFileAbsPath = Paths.get(DataPackageValidationTest.class.getResource(pathName).toURI());
            return sourceFileAbsPath.getParent();
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    private Package getDataPackageFromFilePath(String datapackageFilePath, boolean strict) throws Exception {
        String jsonString = getFileContents(datapackageFilePath);
        Package dp = new Package(jsonString, getBasePath(), strict);
        return dp;
    }

    public String convertToJson(List<Object> validationMessages) {
        try {
            ObjectMapper objectMapper = new ObjectMapper();
            return objectMapper.writerWithDefaultPrettyPrinter().writeValueAsString(validationMessages);
        } catch (Exception e) {
            throw new RuntimeException("Failed to convert to JSON", e);
        }
    }

    private static String getFileContents(String fileName) {
        try {
            return new String(TestUtil.getResourceContent(fileName));
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }
}
---
Issues and Questions
:question: Issue 1: Missing ValidationException for Age Rule
- Description: I have defined a rule in the schema that the age must be at least 19. In the CSV, the person Bob Williams has an age of 17.
  - Expected Behavior: A ValidationException should be thrown.
  - Actual Behavior: No exception is thrown.
- Question: Could you guide me on what might be wrong in my configuration or code?
:question: Issue 2: Lack of Specific Error Messages
- Description: Here is an example of the error messages I receive in another scenario:
  [{ "type" : "required", "code" : "1028", "path" : "$.fields[0]", "schemaPath" : "#/properties/fields/items/anyOf/0/required", "arguments" : [ "name" ], "details" : null, "message" : "$.fields[0].name: is missing but it is required" }]
- Description: The error message does not indicate the specific field or value in the CSV that caused the issue.
- Comparison: The Frictionless Python validator provides more detailed error messages, such as:
  { "message": "The cell \"\" in row at position \"2\" and field \"firstname\" at position \"1\" does not conform to a constraint: constraint \"required\" is \"True\"" }
- Question: How can I achieve similar specific error messages using datapackage-java to make the validation results more user-friendly?
:question: Request for Guidance
- Questions:
  - Kindly suggest how to fix the issue where the age rule is not triggering a ValidationException.
  - Please guide me on how to configure or modify datapackage-java to provide detailed error messages like the Frictionless Python validator.
I appreciate any help or guidance you can provide.
---
Please preserve this line to notify @iSnow (lead of this repository)
@iSnow @amercader @akariv
Appreciate any help or guidance you can provide on the above issue.
Hi @Shreeja-dev, thank you for your feedback.
Issue 1: Missing ValidationException for Age Rule
The missing exception comes from a misunderstanding of how validation works in the library. There are two kinds of validation:
- Formal schema validation happens when you create a datapackage with a schema. This step checks the schema against the tableschema spec. No data validation occurs, and therefore no constraint validation either.
- Data validation happens only when you read data from the datapackage. Validating the data in full at creation time would force the library to process all the data in the package once up front and then again when you read it, so this validation is deferred until you read the data.
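The two phases can be illustrated with a small standalone sketch. It does not use datapackage-java at all: `IntField`, `SketchResource`, and `read()` are hypothetical stand-ins invented for this illustration, and the exception types are plain JDK ones. It only shows the general pattern of checking the schema at construction time and checking constraints when the data is read.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredValidationSketch {

    /** Hypothetical stand-in for an integer field with a minimum constraint. */
    static class IntField {
        final String name;
        final Integer minimum;

        IntField(String name, Integer minimum) {
            this.name = name;
            this.minimum = minimum;
        }
    }

    /** Hypothetical stand-in for a resource: one field plus its raw cell values. */
    static class SketchResource {
        private final IntField field;
        private final List<String> rawValues;

        SketchResource(IntField field, List<String> rawValues) {
            // Phase 1: formal schema check only. The raw values are not inspected here.
            if (field.name == null || field.name.isEmpty()) {
                throw new IllegalArgumentException("schema error: field name is required");
            }
            this.field = field;
            this.rawValues = rawValues;
        }

        /** Phase 2: values are cast and checked against constraints only when read. */
        List<Integer> read() {
            List<Integer> result = new ArrayList<>();
            for (String raw : rawValues) {
                int value = Integer.parseInt(raw.trim());
                if (field.minimum != null && value < field.minimum) {
                    throw new IllegalStateException(
                            "field '" + field.name + "': value " + value
                                    + " is below minimum " + field.minimum);
                }
                result.add(value);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        // Constructing the resource succeeds even though "17" violates the constraint.
        SketchResource resource = new SketchResource(
                new IntField("age", 19), List.of("30", "25", "19", "17"));
        System.out.println("construction succeeded");

        // The violation only surfaces once the data is actually read.
        try {
            resource.read();
        } catch (IllegalStateException e) {
            System.out.println("read failed: " + e.getMessage());
        }
    }
}
```

In this sketch, an `assertThrows` wrapped around the constructor would fail for the same reason the test in the question does: nothing has looked at the rows yet.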
We can rewrite your example so that it works:
void validateDataPackage() throws Exception {
    Package dp = this.getDataPackageFromFilePath(
            "/fixtures/datapackages/constraint-violation/datapackage.json", true);
    Resource resource = dp.getResource("person_data");
    ConstraintsException exception = assertThrows(ConstraintsException.class, () -> resource.getData(false, false, true, false));
    // Assert the validation messages
    Assertions.assertNotNull(exception.getMessage());
    Assertions.assertFalse(exception.getMessage().isEmpty());
}
Note that the exception is thrown only during the resource.getData() call.
Issue 2: Lack of Specific Error Messages
This is true, but it is a limitation of the networknt validator library that we use for formal schema validation.
For data validation, I took some steps to make the exceptions more user-friendly.
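For comparison with the Python message quoted in the question, here is a standalone sketch of what a cell-level message carries: the cell value, the row position, the field name and position, and the violated constraint. It does not reproduce the actual exception text of datapackage-java. The class `CellErrorMessageSketch` and the method `checkMinimum` are hypothetical, and counting the header as row 1 is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CellErrorMessageSketch {

    /**
     * Hypothetical helper: returns one message per cell in the named column
     * whose integer value is below the given minimum.
     * Positions are 1-based; the header counts as row 1 (assumption of this sketch).
     */
    static List<String> checkMinimum(String[] header, List<String[]> rows,
                                     String fieldName, int minimum) {
        int col = Arrays.asList(header).indexOf(fieldName);
        if (col < 0) {
            throw new IllegalArgumentException("unknown field: " + fieldName);
        }
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            String cell = rows.get(i)[col];
            if (Integer.parseInt(cell.trim()) < minimum) {
                messages.add(String.format(
                        "The cell \"%s\" in row at position \"%d\" and field \"%s\" at position \"%d\" "
                                + "does not conform to a constraint: constraint \"minimum\" is \"%d\"",
                        cell, i + 2, fieldName, col + 1, minimum));
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        // The CSV from the question, already split into cells.
        String[] header = {"firstname", "lastname", "gender", "age"};
        List<String[]> rows = List.of(
                new String[] {"John", "Doe", "male", "30"},
                new String[] {"Jane", "Smith", "female", "25"},
                new String[] {"Alice", "Johnson", "female", "19"},
                new String[] {"Bob", "Williams", "male", "17"});

        // Reports the single violating cell: Bob's age of 17.
        checkMinimum(header, rows, "age", 19).forEach(System.out::println);
    }
}
```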
Hope that helps.
I am having no success with validation either. I have tried your example, but no exception is thrown.
public static void main(String[] args) throws Exception {
    String jsonString = org.apache.commons.io.FileUtils.readFileToString(new File("/tmp/datapackage-java/src/test/resources/fixtures/datapackages/constraint-violation/datapackage.json"), Charset.defaultCharset());
    Package dp = new Package(jsonString, Path.of("/tmp/datapackage-java/src/test/resources/fixtures/"), true);
    Resource resource = dp.getResource("person_data");
    resource.getData(false, false, false, false);
}
There is a list of errors: https://github.com/frictionlessdata/datapackage-java/blob/7354d2dd0dd5b24f0a5f4eb3557440b3604e8f47/src/main/java/io/frictionlessdata/datapackage/resource/AbstractResource.java#L71
It is filled in the catch block of the validate method:
https://github.com/frictionlessdata/datapackage-java/blob/7354d2dd0dd5b24f0a5f4eb3557440b3604e8f47/src/main/java/io/frictionlessdata/datapackage/resource/AbstractResource.java#L476-L483
However, I could not find a place where the list of errors is read.