
[ja] replace Japanese PII

Open fujiki-1emon opened this issue 1 year ago • 0 comments

Background

  • It seems we don't need to implement much pre-processing to replace Japanese PII,
    • because the language-agnostic processing already includes some PII pre-processing.
  • But we may still need additional processing to replace some Japan-specific PII.

TODOs

  • [ ] Replace Japanese phone numbers
    • But we might be able to just re-use the Korean implementation
      • cf. https://github.com/EleutherAI/dps/blob/master/dps/spark/utils/korean_utils.py#L182
  • [ ] Replace Japanese bank account numbers
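
As a starting point for the phone number item, here is a minimal sketch of what a Japan-specific replacement could look like. It is not the Korean helper linked above, and its behavior is not modeled on it. The function name `replace_ja_phone_numbers`, the placeholder token `<|tel|>`, and the rule "10 or 11 digits after normalizing a `+81` prefix to a leading `0`" are all assumptions made for illustration. Full-width digits and parenthesized area codes are not handled.

```python
import re

# Hypothetical placeholder token; the real pipeline may use a different one.
PHONE_TOKEN = "<|tel|>"

# Candidate: a domestic number starting with 0, or an international one
# starting with +81, written as up to three digit groups that may be
# separated by a hyphen or a space. ASCII digits only.
_CANDIDATE = re.compile(
    r"(?<![0-9+])(?:\+81[- ]?|0)"
    r"[0-9]{1,4}[- ]?[0-9]{1,4}[- ]?[0-9]{3,4}(?![0-9])"
)


def replace_ja_phone_numbers(text: str, token: str = PHONE_TOKEN) -> str:
    """Replace strings that look like Japanese phone numbers with `token`."""

    def _sub(match: re.Match) -> str:
        raw = match.group()
        digits = re.sub(r"[^0-9]", "", raw)
        if raw.startswith("+81"):
            # Normalize +81 90 ... to the domestic form 090 ...
            digits = "0" + digits[2:]
        # Assumption: keep only candidates with 10 digits (landline,
        # toll-free) or 11 digits (mobile); leave anything else untouched.
        return token if len(digits) in (10, 11) else raw

    return _CANDIDATE.sub(_sub, text)
```

For example, `replace_ja_phone_numbers("電話: 090-1234-5678")` returns `"電話: <|tel|>"`, while a date such as `2023-03-13` is left unchanged. Any 10- or 11-digit number starting with `0` will also be replaced, so false positives on other identifiers are possible.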

fujiki-1emon, Mar 13 '23 00:03