dps
[ja] replace Japanese PII
Background
- It seems we don't need much extra pre-processing to replace Japanese PII, because the language-agnostic processing already handles some PII.
- However, we may still need additional replacements for Japan-specific PII.
TODOs
- [ ] replace Japanese phone number
- But we might be able to just re-use the Korean one
- cf. https://github.com/EleutherAI/dps/blob/master/dps/spark/utils/korean_utils.py#L182
- [ ] replace Japanese bank account number
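
To make the two TODOs above concrete, here is a rough sketch of what the Japanese-specific replacements could look like. Nothing in it is taken from dps: the function names, the placeholder tokens (`<|tel|>`, `<|acc|>`), and the regexes are all assumptions for illustration. The phone pattern assumes ASCII hyphens and the usual domestic formats (mobile/IP prefixes 050/070/080/090, toll-free 0120, 10-digit hyphenated landlines). The bank account pattern assumes a 7-digit number directly after the keyword 口座番号, since a bare 7-digit match would produce too many false positives.

```python
import re

# Hypothetical patterns; not copied from dps or korean_utils.py.
JA_PHONE_PATTERN = re.compile(
    r"(?<!\d)"
    r"(?:"
    r"0[5789]0-?\d{4}-?\d{4}"   # mobile / IP phone, e.g. 090-1234-5678
    r"|0120-?\d{3}-?\d{3}"      # toll-free, e.g. 0120-123-456
    r"|0\d-\d{4}-\d{4}"         # landline, 2-digit area code, e.g. 03-1234-5678
    r"|0\d{2}-\d{3}-\d{4}"      # landline, 3-digit area code
    r"|0\d{3}-\d{2}-\d{4}"      # landline, 4-digit area code
    r"|0\d{4}-\d-\d{4}"         # landline, 5-digit area code
    r")"
    r"(?!\d)"
)

# Assumes the account number is 7 digits and follows the keyword 口座番号,
# optionally separated by whitespace or a (half- or full-width) colon.
JA_ACCOUNT_PATTERN = re.compile(r"(口座番号[\s::]*)\d{7}(?!\d)")


def replace_ja_phone_number(text: str, placeholder: str = "<|tel|>") -> str:
    """Replace Japanese-style phone numbers with a placeholder token."""
    return JA_PHONE_PATTERN.sub(lambda m: placeholder, text)


def replace_ja_bank_account(text: str, placeholder: str = "<|acc|>") -> str:
    """Replace a 7-digit account number that follows 口座番号, keeping the keyword."""
    return JA_ACCOUNT_PATTERN.sub(lambda m: m.group(1) + placeholder, text)


if __name__ == "__main__":
    print(replace_ja_phone_number("電話は090-1234-5678です"))
    print(replace_ja_bank_account("口座番号:1234567"))
```

If the Korean phone-number replacement turns out to cover these formats already, only the bank account part would be needed.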