Add Presidio CLI into the Presidio repo

Open omri374 opened this issue 3 years ago • 5 comments

As agreed with the folks who developed Presidio CLI and did an amazing job on it, we'd like to integrate it into Presidio.

Tagging @dinakar29 and @knightdave. Let's use this issue to discuss the details.

omri374 avatar Jun 28 '22 10:06 omri374

FYI @omri374 - we are ready to propose an ADR here. We will post details soon. Cheers.

cicdguy avatar Sep 06 '22 12:09 cicdguy

Integrate Presidio CLI into Presidio

Status

Proposed

Context

Presidio is available as a Python package and as a Docker image. Presidio CLI has been developed separately from Presidio and lets users run the Presidio analyzer from the command line.
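As a rough illustration of what running an analyzer from the command line involves, the sketch below wires an `argparse` front end to an injected detection function. Everything in it is hypothetical: `Finding`, `toy_email_detector`, `build_parser`, and `run` are illustrative names rather than parts of Presidio CLI or Presidio, and the regex detector is only a stand-in for the real analyzer.

```python
import argparse
import json
import re
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class Finding:
    """Hypothetical result record: entity type plus character offsets."""
    entity_type: str
    start: int
    end: int


# Stand-in detector (assumption): a real CLI would call the Presidio analyzer here.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def toy_email_detector(text: str) -> List[Finding]:
    """Return one Finding per email-like substring in the text."""
    return [Finding("EMAIL_ADDRESS", m.start(), m.end()) for m in EMAIL_RE.finditer(text)]


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line interface: a single positional text argument."""
    parser = argparse.ArgumentParser(prog="pii-scan-sketch")
    parser.add_argument("text", help="text to scan for PII")
    return parser


def run(argv: List[str], detect: Callable[[str], List[Finding]] = toy_email_detector) -> str:
    """Parse arguments, run the detector, and return the findings as JSON."""
    args = build_parser().parse_args(argv)
    return json.dumps([asdict(f) for f in detect(args.text)])
```

Because the detector is passed in as a parameter, the command-line layer stays independent of the detection backend, which is roughly the separation the ADR describes between the CLI and the analyzer.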

Decision

The proposed change is to integrate the Presidio CLI repository into the Presidio repository as a top-level directory (alongside directories such as presidio-analyzer, presidio-anonymizer, and presidio-image-redactor). This change is shown in the diagrams below as Phase 1.

In the future, during Phase 2, the CLI's functionality can be expanded to also use presidio-anonymizer and presidio-image-redactor.
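One way Phase 2 could be laid out is as subcommands on a single entry point, with each subcommand delegating to a different module. The sketch below is an assumption about structure only: the subcommand names `analyze` and `anonymize`, the `--text` flag, and the handler functions are hypothetical, and the handlers are placeholders rather than calls into Presidio.

```python
import argparse
from typing import List


def handle_analyze(args: argparse.Namespace) -> str:
    # Placeholder: Phase 1 behaviour would delegate to presidio-analyzer.
    return f"analyze: {len(args.text)} characters"


def handle_anonymize(args: argparse.Namespace) -> str:
    # Placeholder: Phase 2 would delegate to presidio-anonymizer.
    return f"anonymize: {len(args.text)} characters"


def build_parser() -> argparse.ArgumentParser:
    """Register one subcommand per module and attach its handler."""
    parser = argparse.ArgumentParser(prog="presidio-cli-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    for name, handler in (("analyze", handle_analyze), ("anonymize", handle_anonymize)):
        cmd = sub.add_parser(name)
        cmd.add_argument("--text", required=True)
        cmd.set_defaults(handler=handler)
    return parser


def main(argv: List[str]) -> str:
    """Parse the arguments and dispatch to the selected subcommand's handler."""
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

With this layout, adding an image redaction subcommand later would mean registering one more name and handler pair without touching the existing ones.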

Implementation timeline for the phases:

  • Phase 1 - Q3 2022
  • Phase 2 - Q4 2022

The documentation will need to be updated to describe how to use Presidio through the CLI.

Consequences

Keeping the CLI in the same repository as the modules it calls makes the code easier to manage.

Users will have a simpler way to use Presidio from the command line.

Personas who will benefit from the proposed architecture include:

  • DevOps teams, SREs
  • Data Analysts
  • Data Privacy Officers with minimal programming experience

C4Context
    title System Context diagram for Presidio CLI
    Person(user, "User")
    System(presidio, "Presidio")
    Rel(user, presidio, "Uses")

C4Context
    title Container diagram for Presidio CLI
    Person(user, "User")
    Boundary(b1, "Presidio", "") {
        Boundary(b2, "Phase 1", "") {
            Container(presidiocli, "Presidio CLI")
            Container(presidioanalyzer, "Presidio analyzer")
        }
        Container(presidioanonymizer, "Presidio anonymizer")
        Container(presidioimageredactor, "Presidio image redactor")
    }
    Rel(user, presidioanalyzer, "Uses")
    Rel(user, presidiocli, "Uses")
    Rel(user, presidioanonymizer, "Uses")
    Rel(user, presidioimageredactor, "Uses")
    Rel(presidiocli, presidioanalyzer, "Uses")

C4Context
    title Component diagram for Presidio CLI
    Person(user, "User")
    Boundary(b1, "Presidio", "") {
        Boundary(b3, "Phase 2", "") {
            Boundary(b2, "Phase 1", "") {
                Container(presidiocli, "Presidio CLI")
                Container_Boundary(b4, "analyzer", "") {
                    Component(presidioanalyzer, "Presidio analyzer")
                    Component(presidioanalyzercontainer, "Presidio analyzer container")
                    Component(presidioanalyzerlibrary, "Presidio analyzer library")
                }
            }
            Container_Boundary(b5, "anonymizer", "") {
                Component(presidioanonymizer, "Presidio anonymizer")
                Component(presidioanonymizercontainer, "Presidio anonymizer container")
                Component(presidioanonymizerlibrary, "Presidio anonymizer library")
            }
            Container_Boundary(b6, "imageredactor", "") {
                Component(presidioimageredactor, "Presidio image redactor")
                Component(presidioimageredactorcontainer, "Presidio image redactor container")
                Component(presidioimageredactorlibrary, "Presidio image redactor library")
            }
        }
    }
    Rel(user, presidiocli, "Uses")
    Rel(user, presidioanalyzercontainer, "API")
    Rel(user, presidioanonymizercontainer, "API")
    Rel(user, presidioimageredactorcontainer, "API")
    Rel(presidioanalyzercontainer, presidioanalyzer, "")
    Rel(presidioanonymizercontainer, presidioanonymizer, "")
    Rel(presidioimageredactorcontainer, presidioimageredactor, "")
    Rel(presidioanalyzerlibrary, presidioanalyzer, "")
    Rel(presidioanonymizerlibrary, presidioanonymizer, "")
    Rel(presidioimageredactorlibrary, presidioimageredactor, "")
    Rel(presidiocli, presidioanalyzerlibrary, "")
    Rel(presidiocli, presidioanonymizerlibrary, "")
    Rel(presidiocli, presidioimageredactorlibrary, "")

walkowif avatar Sep 09 '22 12:09 walkowif

Thank you @walkowif, this is great. So Phase 1 would allow the detection of entities, but not their de-identification, correct?

omri374 avatar Sep 11 '22 12:09 omri374

@omri374 - that is correct. Phase 1 will be a direct transfer of the CLI with existing functionality to make sure that at least one core Presidio module works properly with the CLI integration.

Once Phase 1 has been implemented successfully, Phase 2 will add the obfuscation and image text redaction capabilities from the remaining modules.

cicdguy avatar Sep 11 '22 17:09 cicdguy

Sounds good, thanks!

omri374 avatar Sep 11 '22 17:09 omri374