Add Presidio CLI into the Presidio repo
As agreed with the folks who developed Presidio CLI and did an amazing job on it, we'd like to integrate it into Presidio.
Tagging @dinakar29 and @knightdave. Let's use this issue to discuss the details.
FYI @omri374 - we are ready to propose an ADR here. We will post details soon. Cheers.
Integrate Presidio CLI into Presidio
Status
Proposed
Context
Presidio is available as a Python package and as a Docker image. Presidio CLI has been developed separately from Presidio. It lets users run the Presidio analyzer from the command line.
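To make the idea of a command-line front end for the analyzer concrete, here is a minimal sketch. It is not the actual Presidio CLI interface: the program name `pii-scan`, the `main` and `regex_analyze` functions, and the single e-mail regex are all hypothetical stand-ins so that the example runs without Presidio installed. With Presidio available, the stand-in could presumably be swapped for a call such as `AnalyzerEngine().analyze(text=..., language="en")` from `presidio_analyzer`.

```python
import argparse
import json
import re
from typing import Callable, Dict, List

# Hypothetical stand-in for the analyzer: one e-mail regex, nothing more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

Finding = Dict[str, object]


def regex_analyze(text: str) -> List[Finding]:
    """Return one finding per e-mail address found in `text`."""
    return [
        {"entity_type": "EMAIL_ADDRESS", "start": m.start(), "end": m.end()}
        for m in EMAIL_RE.finditer(text)
    ]


def build_parser() -> argparse.ArgumentParser:
    # "pii-scan" is an illustrative name, not a real command.
    parser = argparse.ArgumentParser(prog="pii-scan")
    parser.add_argument("text", help="text to scan for PII")
    return parser


def main(argv: List[str],
         analyze: Callable[[str], List[Finding]] = regex_analyze) -> str:
    """Parse `argv`, run the analyzer callable, and return findings as JSON."""
    args = build_parser().parse_args(argv)
    return json.dumps(analyze(args.text))
```

Calling `main(["Contact jane@example.com today"])` returns a JSON list with one `EMAIL_ADDRESS` finding. Passing the analyzer in as a callable keeps the command-line layer separate from the detection logic, which is the same separation the proposal relies on.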
Decision
The proposed change is to integrate the Presidio CLI repository into the Presidio repository as a top-level directory (alongside directories such as presidio-analyzer, presidio-anonymizer, and presidio-image-redactor). This change is reflected in the diagram as Phase 1.
In the future, during Phase 2, the functionality of the CLI could be expanded to support presidio-anonymizer and presidio-image-redactor.
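One way the two phases could map onto a command-line layout is with subcommands: an `analyze` command that only reports detected spans (Phase 1) and an `anonymize` command that also masks them (Phase 2). The sketch below is an assumption about the shape, not the agreed design. The names `presidio-sketch`, `detect`, `mask`, and `run`, the `<REDACTED>` placeholder, and the e-mail regex are all hypothetical stand-ins for the real Presidio modules.

```python
import argparse
import re
from typing import List, Tuple

Span = Tuple[int, int]

# Hypothetical stand-in for entity detection: a single e-mail regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def detect(text: str) -> List[Span]:
    """Phase 1 behaviour: return (start, end) offsets of detected entities."""
    return [(m.start(), m.end()) for m in EMAIL_RE.finditer(text)]


def mask(text: str, spans: List[Span], placeholder: str = "<REDACTED>") -> str:
    """Phase 2 behaviour: replace each detected span with a placeholder."""
    # Work from the last span backwards so earlier offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="presidio-sketch")  # illustrative name
    sub = parser.add_subparsers(dest="command", required=True)
    analyze = sub.add_parser("analyze", help="Phase 1: report detected spans")
    analyze.add_argument("text")
    anonymize = sub.add_parser("anonymize", help="Phase 2: mask detected spans")
    anonymize.add_argument("text")
    return parser


def run(argv: List[str]) -> str:
    """Dispatch on the chosen subcommand and return the output as a string."""
    args = build_parser().parse_args(argv)
    spans = detect(args.text)
    if args.command == "analyze":
        return repr(spans)
    return mask(args.text, spans)
```

For example, `run(["analyze", "mail a@b.io now"])` reports the span offsets, while `run(["anonymize", "mail a@b.io now"])` returns the text with the address replaced. Both subcommands share the detection step, so adding the second one does not change the first.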
Implementation timeline for the phases:
- Phase 1 - Q3 2022
- Phase 2 - Q4 2022
The documentation will have to be updated to describe how to use Presidio from the CLI.
Consequences
Keeping the CLI in the same repository as the modules it wraps will make the code easier to manage.
Users will have a simpler way to use Presidio from the command line.
Personas who will benefit from the proposed architecture include:
- DevOps teams, SREs
- Data Analysts
- Data Privacy Officers with minimal programming experience
C4Context
title System Context diagram for Presidio CLI
Person(user, "User")
System(presidio, "Presidio")
Rel(user, presidio, "Uses")
C4Container
title Container diagram for Presidio CLI
Person(user, "User")
Boundary(b1, "Presidio", "") {
Boundary(b2, "Phase 1", "") {
Container(presidiocli, "Presidio CLI")
Container(presidioanalyzer, "Presidio analyzer")
}
Container(presidioanonymizer, "Presidio anonymizer")
Container(presidioimageredactor, "Presidio image redactor")
}
Rel(user, presidioanalyzer, "Uses")
Rel(user, presidiocli, "Uses")
Rel(user, presidioanonymizer, "Uses")
Rel(user, presidioimageredactor, "Uses")
Rel(presidiocli, presidioanalyzer, "Uses")
C4Component
title Component diagram for Presidio CLI
Person(user, "User")
Boundary(b1, "Presidio", "") {
Boundary(b3, "Phase 2", "") {
Boundary(b2, "Phase 1", "") {
Container(presidiocli, "Presidio CLI")
Container_Boundary(b4, "analyzer", "") {
Component(presidioanalyzer, "Presidio analyzer")
Component(presidioanalyzercontainer, "Presidio analyzer container")
Component(presidioanalyzerlibrary, "Presidio analyzer library")
}
}
Container_Boundary(b5, "anonymizer", "") {
Component(presidioanonymizer, "Presidio anonymizer")
Component(presidioanonymizercontainer, "Presidio anonymizer container")
Component(presidioanonymizerlibrary, "Presidio anonymizer library")
}
Container_Boundary(b6, "imageredactor", "") {
Component(presidioimageredactor, "Presidio image redactor")
Component(presidioimageredactorcontainer, "Presidio image redactor container")
Component(presidioimageredactorlibrary, "Presidio image redactor library")
}
}
}
Rel(user, presidiocli, "Uses")
Rel(user, presidioanalyzercontainer, "API")
Rel(user, presidioanonymizercontainer, "API")
Rel(user, presidioimageredactorcontainer, "API")
Rel(presidioanalyzercontainer, presidioanalyzer, "")
Rel(presidioanonymizercontainer, presidioanonymizer, "")
Rel(presidioimageredactorcontainer, presidioimageredactor, "")
Rel(presidioanalyzerlibrary, presidioanalyzer, "")
Rel(presidioanonymizerlibrary, presidioanonymizer, "")
Rel(presidioimageredactorlibrary, presidioimageredactor, "")
Rel(presidiocli, presidioanalyzerlibrary, "")
Rel(presidiocli, presidioanonymizerlibrary, "")
Rel(presidiocli, presidioimageredactorlibrary, "")
Thank you @walkowif, this is great. So Phase 1 would allow the detection of entities, but not their de-identification, correct?
@omri374 - that is correct. Phase 1 will be a direct transfer of the CLI with existing functionality to make sure that at least one core Presidio module works properly with the CLI integration.
Following successful implementation of Phase 1, Phase 2 will include implementation of obfuscation and image text redaction capabilities from the remaining modules.
Sounds good, thanks!