Data protection template using Protegrity

Open ilya-kozyrev opened this issue 4 years ago • 41 comments

Some users may want to protect their sensitive data using tokenization. We propose creating a template that integrates with an external protection RPC server through a Beam transform to tokenize sensitive data.

At a high level, the template will:

  • support batch (GCS) and streaming (Pub/Sub) input sources
  • tokenize sensitive data via an external RPC service (we plan to use Protegrity)
  • output tokenized data to BigQuery, BigTable, or GCS
  • support CSV and JSON formats

More details and the proposed design are available in the design doc.
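
The flow outlined above can be illustrated with a minimal sketch in plain Python. This is not the template's code and it does not use Beam or the Protegrity API: `parse_records`, `tokenize_records`, `fake_tokenize_batch`, and the batch size are all hypothetical names chosen for illustration, and the RPC call is replaced by a local stand-in function. It only shows the general shape of the idea: parse CSV or JSON records, send the sensitive fields to a tokenization service in batches, and emit records with those fields replaced.

```python
import csv
import io
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]
# Hypothetical stand-in for the external RPC call: it receives a batch of
# values and must return one token per value, in the same order.
TokenizeBatch = Callable[[List[str]], List[str]]


def parse_records(text: str, fmt: str) -> List[Record]:
    """Parse CSV (with a header row) or newline-delimited JSON into dicts."""
    if fmt == "csv":
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]
    if fmt == "json":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {fmt}")


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Group records into lists of at most `size` items."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def tokenize_records(
    records: Iterable[Record],
    sensitive_fields: List[str],
    tokenize_batch: TokenizeBatch,
    batch_size: int = 2,
) -> Iterator[Record]:
    """Yield copies of the records with the sensitive fields tokenized."""
    for batch in batched(records, batch_size):
        copies = [dict(record) for record in batch]  # leave the input untouched
        for field in sensitive_fields:
            values = [record[field] for record in copies]
            tokens = tokenize_batch(values)  # one service call per field per batch
            if len(tokens) != len(values):
                raise ValueError("tokenizer returned a different number of values")
            for record, token in zip(copies, tokens):
                record[field] = token
        yield from copies


def fake_tokenize_batch(values: List[str]) -> List[str]:
    """Illustrative tokenizer only: reverses each value. Not secure."""
    return ["tok_" + value[::-1] for value in values]


if __name__ == "__main__":
    rows = parse_records("name,ssn\nAlice,123\nBob,456\nCarol,789\n", "csv")
    for row in tokenize_records(rows, ["ssn"], fake_tokenize_batch):
        print(row)
```

Batching the values before each call is one way to limit the number of round trips to an external service; the right batch size, error handling, and retry behaviour would depend on the actual service and are left out of this sketch.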

ilya-kozyrev avatar Jan 14 '21 12:01 ilya-kozyrev

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Jan 14 '21 12:01 google-cla[bot]

@googlebot I consent.

KhaninArtur avatar Jan 14 '21 12:01 KhaninArtur

@googlebot I consent.

ramazan-yapparov avatar Jan 14 '21 12:01 ramazan-yapparov

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Jan 14 '21 12:01 google-cla[bot]

@googlebot I consent

daria-malkova avatar Jan 14 '21 12:01 daria-malkova

@googlebot I fixed it

daria-malkova avatar Jan 14 '21 13:01 daria-malkova

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Feb 05 '21 12:02 google-cla[bot]

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Feb 08 '21 00:02 google-cla[bot]

@googlebot I consent.

AKosolapov avatar Feb 08 '21 00:02 AKosolapov

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Mar 17 '21 20:03 google-cla[bot]

@googlebot I fixed it..

MikhailMedvedevAkvelon avatar Mar 18 '21 13:03 MikhailMedvedevAkvelon

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Mar 18 '21 13:03 google-cla[bot]

@googlebot I consent.

MikhailMedvedevAkvelon avatar Mar 18 '21 13:03 MikhailMedvedevAkvelon

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Mar 26 '21 06:03 google-cla[bot]

@googlebot I consent.

Nuzhdina-Elena avatar Mar 26 '21 07:03 Nuzhdina-Elena

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Apr 02 '21 08:04 google-cla[bot]

@googlebot I fixed it.

RaphaelSanamyan avatar Apr 07 '21 09:04 RaphaelSanamyan

Hi, @prathapreddy123 Did you have a chance to look at this PR?

We merged a similar PR with the e2e example into the Beam repository.

ilya-kozyrev avatar Apr 13 '21 13:04 ilya-kozyrev

Hi @ilya-kozyrev - Not yet. I will check this week.

prathapreddy123 avatar Apr 13 '21 17:04 prathapreddy123

Thanks for contributing. I believe several classes (if carefully designed) can be reused by promoting them to a common module shared across templates in the future. I've included a few ideas for reference.

Thank you for your review. That's an awesome idea to make io classes more generic and promote them to the common module. I think it will be a good next step.

To keep the PR small and reduce review effort, would it make sense to use the internal io classes in this template as is? After this PR is merged, we'll create separate PRs for each transform to promote them into the common module based on your suggestions. Once all transforms are implemented and merged, we'll create another PR to refactor the current template.

What do you think?

ilya-kozyrev avatar Apr 21 '21 20:04 ilya-kozyrev

Sure, we can consider that approach. But since this pipeline has already been contributed to the Beam repo, instead of replicating it here we could demonstrate patterns by building reusable classes, which would considerably reduce the effort for new templates.

prathapreddy123 avatar Apr 23 '21 02:04 prathapreddy123

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar May 11 '21 11:05 google-cla[bot]

@googlebot I consent.

RaphaelSanamyan avatar May 11 '21 13:05 RaphaelSanamyan