cudf
JSON reader validation of values
Description
Addresses part of https://github.com/rapidsai/cudf/issues/15222
This change adds a validation stage to the JSON reader at the token level. If any value in a row fails validation, the entire row is set to null.
- [x] validation functor - implement Spark validation rules. (@revans2 implemented all validation rules)
- [x] move output iterator to thrust. (already merged in https://github.com/NVIDIA/cccl/pull/2282)
- [x] Fix failing tests and infer data type for Float.
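
To illustrate the row-nulling behavior described above, here is a minimal pure-Python sketch. It is not the cuDF implementation and does not use the cuDF API. The token layout (`field`, `kind`, `text`) and the helper names `is_valid_value` and `validate_rows` are hypothetical, and the rules shown are the strict JSON grammar for numbers and literals, used as stand-ins for the Spark rules implemented in this PR, which may differ.

```python
import re

# Strict JSON number grammar (RFC 8259): optional minus, no leading zeros,
# optional fraction, optional exponent. Stand-in rule, not the PR's Spark rules.
_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")
_LITERALS = {"true", "false", "null"}


def is_valid_value(kind, text):
    """Return True if a single value token passes the illustrative rules."""
    if kind == "string":
        # Assumption: the tokenizer already handled quoting and escapes.
        return True
    if kind == "number":
        return _NUMBER.fullmatch(text) is not None
    if kind == "literal":
        return text in _LITERALS
    return False  # unknown token kinds are treated as invalid


def validate_rows(rows):
    """Replace every row that contains an invalid value token with None."""
    out = []
    for row in rows:
        ok = all(is_valid_value(kind, text) for _field, kind, text in row)
        out.append(row if ok else None)
    return out


# Each row is a list of (field, kind, text) tokens; hypothetical layout.
rows = [
    [("a", "number", "1"), ("b", "string", "x")],
    [("a", "number", "01"), ("b", "string", "y")],        # leading zero
    [("a", "literal", "True"), ("b", "number", "2.5")],   # wrong-case literal
]
result = validate_rows(rows)
```

In this sketch only the first row survives. The second and third rows each contain one invalid token, so the whole row becomes `None` rather than just the offending field.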
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
I am seeing two test failures around NBSP in a quoted string. I need to do some more debugging to see if it is my code changes or yours that are causing the problem.
Pushed my first review round. Will come back later. Thanks for working on this :)
/merge