Add "HIPAA-safe Quickstart" and clarify --exclude usage

Open rdlasco opened this issue 4 months ago • 3 comments

Please add a short “HIPAA-safe Quickstart” section to the README so new users can try de-identification without using real PHI. The note should point to the deid-data package (synthetic DICOM files), show how to run deid --help (or python -m deid --help), include a very small Python example that opens one synthetic file, and explain that --exclude accepts a comma-separated list (for example: build,.git,venv). This is a documentation-only change; I’m happy to open the pull request. The goal is to reduce the chance of handling real PHI by mistake and make first-time setup clearer.
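To illustrate the comma-separated form, here is a minimal sketch of how a value such as build,.git,venv could be split into names and matched against path components. This is not deid's actual implementation: the helpers parse_exclude and is_excluded and the "example" program name are hypothetical, and the real CLI may treat the value differently (for instance, as patterns rather than exact names).

```python
import argparse
from pathlib import Path


def parse_exclude(value: str) -> list[str]:
    """Split a comma-separated string into names, dropping empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def is_excluded(path: Path, excluded: list[str]) -> bool:
    """Return True if any component of the path equals an excluded name."""
    return any(part in excluded for part in path.parts)


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration only; not the deid command line.
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--exclude",
        type=parse_exclude,  # converts "build,.git,venv" into a list
        default=[],
        help="comma-separated names to skip, e.g. build,.git,venv",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--exclude", "build,.git,venv"])
    print(args.exclude)
    print(is_excluded(Path("venv/lib/site.py"), args.exclude))
    print(is_excluded(Path("data/image.dcm"), args.exclude))
```

Under this assumption, one flag carries the whole list, so users do not need to repeat --exclude once per directory.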

rdlasco avatar Sep 11 '25 15:09 rdlasco

I'd be happy to review a PR that makes these changes.

vsoch avatar Sep 11 '25 16:09 vsoch

Opened PR #288 to implement this: adds a “HIPAA-safe Quickstart” section using synthetic data (no PHI). Happy to revise based on feedback.

rdlasco avatar Sep 12 '25 15:09 rdlasco

Hi all, just wanted to make sure this project was aware of these datasets we host on TCIA that contain a bunch of realistic synthetic data meant for testing de-id tool performance in case it's useful:

Rutherford, M., Mun, S.K., Levine, B., Bennett, W.C., Smith, K., Farmer, P., Jarosz, J., Wagner, U., Farahani, K., Prior, F. (2021). A DICOM dataset for evaluation of medical image de-identification (Pseudo-PHI-DICOM-Data) [Data set]. The Cancer Imaging Archive. DOI: https://doi.org/10.7937/s17z-r072

and

Rutherford, M. W., Nolan, T., Pei, L., Wagner, U., Pan, Q., Farmer, P., Smith, K., Kopchick, B., Laura Opsahl-Ong, Sutton, G., Clunie, D. A., Farahani, K., & Prior, F. (2025). Data in Support of the MIDI-B Challenge (MIDI-B-Synthetic-Validation, MIDI-B-Curated-Validation, MIDI-B-Synthetic-Test, MIDI-B-Curated-Test) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/cf2p-aw56

You would be free to include as much of it as you want in your test data package as long as you include this citation in your docs for attribution.

kirbyju avatar Sep 19 '25 01:09 kirbyju