Refactor JsonRecordReader and fix incorrect parsing of JSON objects for multi-line JSON
This PR is a continuation of https://github.com/greenplum-db/pxf/pull/858.
This commit refactors the JsonRecordReader to internally use the LineRecordReader when handling multi-line JSON files. This is done to avoid incorrectly parsing JSON objects, especially those that contain special characters.
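To illustrate why line-oriented reading has to be paired with string-aware scanning, here is a minimal sketch. It is not the PXF implementation: the class `LineBasedJsonSplitter` and its methods are hypothetical, and a plain `BufferedReader` stands in for Hadoop's `LineRecordReader`. It assembles objects line by line and ignores braces that appear inside string literals or after a backslash escape.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PXF: splits multi-line JSON text into
// complete top-level objects while reading one line at a time.
final class LineBasedJsonSplitter {

    static List<String> splitObjects(BufferedReader reader) throws IOException {
        List<String> objects = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting depth of '{' outside string literals
        boolean inString = false; // true while inside a "..." literal
        boolean escaped = false;  // true right after a backslash inside a string

        String line;
        while ((line = reader.readLine()) != null) {
            for (int i = 0; i < line.length(); i++) {
                char c = line.charAt(i);
                if (depth > 0) {
                    current.append(c); // keep every character of an open object
                }
                if (inString) {
                    if (escaped) {
                        escaped = false;      // escaped character, e.g. \" or \\
                    } else if (c == '\\') {
                        escaped = true;
                    } else if (c == '"') {
                        inString = false;     // closing quote
                    }
                } else if (c == '"') {
                    inString = true;
                } else if (c == '{') {
                    if (depth == 0) {
                        current.append(c);    // first brace of a new object
                    }
                    depth++;
                } else if (c == '}' && depth > 0) {
                    depth--;
                    if (depth == 0) {         // object is complete
                        objects.add(current.toString());
                        current.setLength(0);
                    }
                }
            }
            if (depth > 0) {
                current.append('\n'); // preserve the line break inside an open object
            }
        }
        return objects;
    }

    // Convenience overload for in-memory text.
    static List<String> splitObjects(String text) {
        try {
            return splitObjects(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String input = "{\"a\": \"x}y\",\n \"b\": \"q\\\"{\"}\n{\"c\": 1}";
        for (String object : splitObjects(input)) {
            System.out.println(object);
        }
    }
}
```

A scanner that only counts braces would end the first object at the `}` inside `"x}y"`. Tracking string and escape state is what keeps such special characters from being read as structure.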
This commit also adds a new table parameter, "USE_PARALLEL_READ", which lets users toggle between the default HdfsDataFragmenter (true) and the HdfsFileFragmenter (false).
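The toggle can be pictured with the following sketch. It is an assumption-laden illustration rather than the code in this PR: `FragmenterSelector` and `chooseFragmenter` are hypothetical names, the options map stands in for however PXF exposes table parameters, and the handling of values other than "true" and "false" is a guess.

```java
import java.util.Map;

// Hypothetical helper, not part of PXF: maps the USE_PARALLEL_READ table
// parameter to the name of the fragmenter described in this commit.
final class FragmenterSelector {

    static final String USE_PARALLEL_READ = "USE_PARALLEL_READ";

    static String chooseFragmenter(Map<String, String> options) {
        String raw = options.get(USE_PARALLEL_READ);
        // The parameter defaults to true when it is absent.
        // Assumption: any value other than "true" (case-insensitive) counts as false.
        boolean parallel = raw == null || Boolean.parseBoolean(raw.trim());
        return parallel ? "HdfsDataFragmenter" : "HdfsFileFragmenter";
    }
}
```

With no parameter set, `chooseFragmenter` returns "HdfsDataFragmenter". Passing "USE_PARALLEL_READ" with the value "false" returns "HdfsFileFragmenter".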