API for JSON unquoted whitespace normalization

Open shrshi opened this issue 1 year ago • 0 comments

Description

This work is a follow-up to PR #14931, which provided a proof of concept for using a finite-state transducer (FST) to normalize unquoted whitespace. This PR implements the pre-processing FST in cuIO and adds a JSON reader option that must be set to true to invoke the normalizer. Addresses feature request #14865.
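To illustrate what "normalizing unquoted whitespace" means, here is a minimal CPU-side sketch in Python. It is not the cuIO implementation and does not use any cuDF API. The function name `normalize_unquoted_whitespace` is hypothetical, and the choice to drop only spaces and tabs while keeping newlines (so that JSON Lines record boundaries survive) is an assumption made for this example. It mimics a small state machine that tracks whether the current character is inside a double-quoted string.

```python
def normalize_unquoted_whitespace(text: str) -> str:
    """Drop spaces and tabs that appear outside double-quoted JSON strings.

    Illustrative only: a hypothetical helper, not the cuIO FST.
    Assumption: newlines are kept because they may delimit JSON Lines records.
    """
    out = []
    in_string = False  # state: currently inside a double-quoted string
    escaped = False    # state: previous character was a backslash in a string
    for ch in text:
        if in_string:
            # Everything inside a string is copied through unchanged.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch in " \t":
            # Unquoted space or tab: drop it.
            continue
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    raw = '{ "a b" : 1,\t"c" : "x\\" y" }'
    print(normalize_unquoted_whitespace(raw))
```

Whitespace inside quoted values, including whitespace that follows an escaped quote, is left untouched; only whitespace between tokens is removed.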

Checklist

  • [X] I am familiar with the Contributing Guidelines.
  • [X] New or existing tests cover these changes.
  • [ ] The documentation is up to date with these changes.

shrshi · Feb 13 '24 00:02