Introduce Amazon Comprehend Service
This PR adds support for the Amazon Comprehend StartPiiEntitiesDetectionJob API: a hook, operator, sensor, trigger, and waiter, along with operator documentation, unit tests, and a system test.
At present only the PII entities detection job is supported. The remaining Comprehend services will follow.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend/client/start_pii_entities_detection_job.html
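As a rough illustration of how the operator's arguments line up with the boto3 call linked above, here is a minimal sketch. `build_pii_job_request` is a hypothetical helper written for this description and is not part of the provider; it assumes the operator forwards its arguments to `start_pii_entities_detection_job` in roughly this shape, with `start_pii_entities_kwargs` merged in as extra request parameters. The bucket name and account ID in the usage lines are placeholders.

```python
from typing import Any, Dict, Optional


def build_pii_job_request(
    input_data_config: Dict[str, Any],
    output_data_config: Dict[str, Any],
    mode: str,
    data_access_role_arn: str,
    language_code: str,
    extra_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for
    boto3's comprehend.start_pii_entities_detection_job."""
    request: Dict[str, Any] = {
        # Keys below are the parameter names documented by boto3.
        "InputDataConfig": input_data_config,
        "OutputDataConfig": output_data_config,
        "Mode": mode,
        "DataAccessRoleArn": data_access_role_arn,
        "LanguageCode": language_code,
    }
    # Optional parameters such as RedactionConfig or JobName ride along as-is.
    request.update(extra_kwargs or {})
    return request


# Usage with placeholder values (assumed, not taken from a real account).
example_request = build_pii_job_request(
    input_data_config={
        "S3Uri": "s3://example-bucket/sample_data.txt",
        "InputFormat": "ONE_DOC_PER_LINE",
    },
    output_data_config={"S3Uri": "s3://example-bucket/redacted_output/"},
    mode="ONLY_REDACTION",
    data_access_role_arn="arn:aws:iam::123456789012:role/ComprehendRole",
    language_code="en",
    extra_kwargs={
        "RedactionConfig": {
            "PiiEntityTypes": ["NAME", "ADDRESS"],
            "MaskMode": "REPLACE_WITH_PII_ENTITY_TYPE",
        }
    },
)
print(sorted(example_request))
```

With a real client, the resulting dictionary could be passed as `client.start_pii_entities_detection_job(**example_request)`.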
Sample DAG:
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.comprehend import ComprehendStartPiiEntitiesDetectionJobOperator

with DAG(
    dag_id="comprehend_testing",
    schedule_interval=None,
    start_date=datetime(2021, 1, 1),
    tags=["comprehend pii entities detection"],
    catchup=False,
) as dag:
    # Start a Comprehend job that redacts names and addresses from the input file.
    pii_entities_detection_job = ComprehendStartPiiEntitiesDetectionJobOperator(
        task_id="pii_entities_detection_job",
        input_data_config={
            "S3Uri": "s3://aws-comprehend-testing-hpl7cy/sample_data.txt",
            "InputFormat": "ONE_DOC_PER_LINE",
        },
        output_data_config={"S3Uri": "s3://aws-comprehend-testing-hpl7cy/redacted_output/"},
        mode="ONLY_REDACTION",
        language_code="en",
        # Replace {ACCOUNT_ID} with your AWS account ID.
        data_access_role_arn="arn:aws:iam::{ACCOUNT_ID}:role/ComprehendRole",
        start_pii_entities_kwargs={
            "RedactionConfig": {
                "PiiEntityTypes": ["NAME", "ADDRESS"],
                "MaskMode": "REPLACE_WITH_PII_ENTITY_TYPE",
            }
        },
    )
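The sensor, trigger, and waiter in this PR all come down to polling the job until it reaches a final state. The sketch below shows that idea in plain Python and is not the provider's implementation. `wait_for_pii_job` and `FAILURE_STATES` are hypothetical names; `describe_job` is assumed to be a callable with the same shape as boto3's `describe_pii_entities_detection_job`, whose response carries the status under `PiiEntitiesDetectionJobProperties.JobStatus`.

```python
import time
from typing import Any, Callable, Dict

# Statuses treated as failures in this sketch (an assumption of the example).
FAILURE_STATES = {"FAILED", "STOP_REQUESTED", "STOPPED"}


def wait_for_pii_job(
    describe_job: Callable[..., Dict[str, Any]],
    job_id: str,
    poll_interval: float = 60.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Hypothetical helper: poll until the job completes, fails, or times out."""
    for _ in range(max_attempts):
        response = describe_job(JobId=job_id)
        status = response["PiiEntitiesDetectionJobProperties"]["JobStatus"]
        if status == "COMPLETED":
            return status
        if status in FAILURE_STATES:
            raise RuntimeError(f"Comprehend job {job_id} ended with status {status}")
        # SUBMITTED or IN_PROGRESS: wait before asking again.
        sleep(poll_interval)
    raise TimeoutError(f"Comprehend job {job_id} did not finish after {max_attempts} checks")
```

With a real client this might be called as `wait_for_pii_job(client.describe_pii_entities_detection_job, job_id)`. Injecting `describe_job` and `sleep` keeps the loop testable without AWS access.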
This is an awesome PR! Super thorough and ticks all the boxes. We'll use this as an example for future folks, great work! 😃
Thank you so much for reviewing this 😄. I've applied all your feedback. The quick start guides are really helpful and well documented.
Awesome PR!
Thank you @vincbeck 😃