parquet-java
Data obfuscation layer for encryption
Data obfuscation in sensitive columns - for users without access to column encryption keys.
- Implement on top of basic Parquet encryption
- Built-in support for multiple masking mechanisms, with different trade-offs between data utility, leakage, and size/throughput overhead
- Provide an interface for plugging in custom masking mechanisms
- Enable storing multiple masked versions of the same column in a file
- Provide readers with an explicit list of each column’s masked versions in a file
- Enable readers to select a masked version of a column
- Stretch: Implement tools for analysis of file data privacy properties and information leakage
- Stretch: Leverage privacy analysis tools for tuning file data anonymity
- Optional: Support aggregated obfuscation
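To make the plug-in and version-selection items above more concrete, here is a minimal in-memory sketch of how they could fit together. Everything in it is an assumption for illustration: `ColumnMasker`, `NullifyMasker`, `KeepSuffixMasker`, `HashMasker`, `MaskedVersions`, and the version ids are hypothetical names, not part of the parquet-java API, and the sketch only handles string values. In an actual file the masked values would presumably be computed at write time and stored next to the encrypted column; here they are computed on the fly to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plug-in contract for a masking mechanism (not a parquet-java type). */
interface ColumnMasker {
    /** Identifier under which this masked version is listed to readers. */
    String id();
    /** Returns the obfuscated form of a single column value. */
    String mask(String value);
}

/** Fixed placeholder: no leakage beyond row count, but no data utility either. */
final class NullifyMasker implements ColumnMasker {
    @Override public String id() { return "nullify"; }
    @Override public String mask(String value) { return "****"; }
}

/** Keeps the last N characters: some utility, at the cost of partial leakage. */
final class KeepSuffixMasker implements ColumnMasker {
    private final int keep;
    KeepSuffixMasker(int keep) {
        if (keep < 0) throw new IllegalArgumentException("keep must be >= 0");
        this.keep = keep;
    }
    @Override public String id() { return "keep-last-" + keep; }
    @Override public String mask(String value) {
        int hidden = Math.max(0, value.length() - keep);
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < hidden; i++) sb.append('*');
        return sb.append(value, hidden, value.length()).toString();
    }
}

/** SHA-256 hex digest: preserves equality for joins, but leaks value frequencies. */
final class HashMasker implements ColumnMasker {
    @Override public String id() { return "sha256"; }
    @Override public String mask(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) hex.append(String.format("%02x", b & 0xff));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}

/** Hypothetical per-column catalogue of masked versions, in registration order. */
final class MaskedVersions {
    private final Map<String, Map<String, ColumnMasker>> byColumn = new LinkedHashMap<>();

    void register(String column, ColumnMasker masker) {
        byColumn.computeIfAbsent(column, c -> new LinkedHashMap<>()).put(masker.id(), masker);
    }

    /** Explicit list of the masked versions available for a column. */
    List<String> versionsOf(String column) {
        Map<String, ColumnMasker> versions = byColumn.get(column);
        return versions == null ? new ArrayList<String>() : new ArrayList<>(versions.keySet());
    }

    /** Reader-side selection of one masked version of a column value. */
    String read(String column, String versionId, String rawValue) {
        Map<String, ColumnMasker> versions = byColumn.get(column);
        ColumnMasker masker = versions == null ? null : versions.get(versionId);
        if (masker == null) {
            throw new IllegalArgumentException("No masked version '" + versionId + "' for column " + column);
        }
        return masker.mask(rawValue);
    }
}
```

A reader without the column key would call `versionsOf("card_number")` to see what is available and then `read("card_number", "keep-last-4", ...)` to pick one. The three maskers are meant to illustrate the utility/leakage trade-off mentioned above, not to propose the built-in set.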
Reporter: Gidon Gershinsky / @ggershinsky
Assignee: Gidon Gershinsky / @ggershinsky
Related issues:
- Parquet modular encryption (depends upon)
PRs and other links:
Note: This issue was originally created as PARQUET-1376. Please see the migration documentation for further details.