
Introduce Amazon Comprehend Service

Open gopidesupavan opened this issue 1 year ago • 1 comment

Adds support for the Amazon Comprehend start PII entities detection job: docs, hook, operator, sensor, trigger, waiter, unit tests, and a system test.

At present, only the PII entities detection job is supported. The remaining Comprehend services will follow.

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend/client/start_pii_entities_detection_job.html
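For readers unfamiliar with the linked boto3 call, here is a minimal sketch of the request it expects. The helper names `build_pii_job_request` and `start_pii_job`, and all bucket, role, and region values, are hypothetical and chosen for illustration only; this is not the hook code added in this PR. The parameter names (`InputDataConfig`, `OutputDataConfig`, `Mode`, `DataAccessRoleArn`, `LanguageCode`, `RedactionConfig`) are those of the boto3 Comprehend client.

```python
from __future__ import annotations

from typing import Any


def build_pii_job_request(
    input_data_config: dict[str, Any],
    output_data_config: dict[str, Any],
    mode: str,
    data_access_role_arn: str,
    language_code: str,
    extra_kwargs: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for start_pii_entities_detection_job (hypothetical helper)."""
    request: dict[str, Any] = {
        "InputDataConfig": input_data_config,
        "OutputDataConfig": output_data_config,
        "Mode": mode,
        "DataAccessRoleArn": data_access_role_arn,
        "LanguageCode": language_code,
    }
    # Optional parameters such as RedactionConfig or JobName are merged in last.
    request.update(extra_kwargs or {})
    # boto3 documents RedactionConfig as required when Mode is ONLY_REDACTION.
    if mode == "ONLY_REDACTION" and "RedactionConfig" not in request:
        raise ValueError("RedactionConfig is required when mode is ONLY_REDACTION")
    return request


def start_pii_job(request: dict[str, Any], region_name: str) -> str:
    """Submit the job and return its JobId. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("comprehend", region_name=region_name)
    response = client.start_pii_entities_detection_job(**request)
    return response["JobId"]
```

Keeping the request builder separate from the network call means the argument handling can be checked without AWS access; only `start_pii_job` touches the service.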

Sample DAG:

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.comprehend import ComprehendStartPiiEntitiesDetectionJobOperator

with DAG(
    dag_id="comprehend_testing",
    # `schedule` replaces `schedule_interval`, which is deprecated since Airflow 2.4.
    schedule=None,
    start_date=datetime(2021, 1, 1),
    tags=["comprehend pii entities detection"],
    catchup=False,
) as dag:
    pii_entities_detection_job = ComprehendStartPiiEntitiesDetectionJobOperator(
        task_id="pii_entities_detection_job",
        input_data_config={
            "S3Uri": "s3://aws-comprehend-testing-hpl7cy/sample_data.txt",
            "InputFormat": "ONE_DOC_PER_LINE",
        },
        output_data_config={"S3Uri": "s3://aws-comprehend-testing-hpl7cy/redacted_output/"},
        mode="ONLY_REDACTION",
        language_code="en",
        # Replace {ACCOUNT_ID} with the AWS account ID that owns the role.
        data_access_role_arn="arn:aws:iam::{ACCOUNT_ID}:role/ComprehendRole",
        # Replace each detected NAME and ADDRESS with its entity type.
        start_pii_entities_kwargs={
            "RedactionConfig": {
                "PiiEntityTypes": ["NAME", "ADDRESS"],
                "MaskMode": "REPLACE_WITH_PII_ENTITY_TYPE",
            }
        },
    )
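The description also lists a sensor, trigger, and waiter, all of which wait for a started job to finish. As a rough illustration of that idea, and not the implementation in this PR, the sketch below polls a job status until it reaches a terminal state. The function `wait_for_job`, its parameters, and the default interval and attempt count are assumptions; the status strings are the `JobStatus` values documented by boto3 for Comprehend jobs.

```python
import time
from typing import Callable

# JobStatus values that end a job without a usable result.
FAILURE_STATES = {"FAILED", "STOP_REQUESTED", "STOPPED"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_interval: float = 60,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the job completes, fails, or attempts run out."""
    for _ in range(max_attempts):
        status = get_status()
        if status == "COMPLETED":
            return status
        if status in FAILURE_STATES:
            raise RuntimeError(f"Comprehend job ended with status {status}")
        # SUBMITTED or IN_PROGRESS: wait before asking again.
        sleep(poll_interval)
    raise TimeoutError(f"Job did not finish after {max_attempts} attempts")


# Example status source (needs boto3, credentials, and a real job id):
#   client = boto3.client("comprehend")
#   get_status = lambda: client.describe_pii_entities_detection_job(JobId=job_id)[
#       "PiiEntitiesDetectionJobProperties"]["JobStatus"]
```

Passing the status lookup and the sleep function as arguments keeps the loop testable without calling AWS.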


gopidesupavan, May 13 '24 19:05

This is an awesome PR! Super thorough and ticks all the boxes. We'll use this as an example for future folks, great work! 😃

Thank you so much for reviewing this 😄. I have applied all your feedback. The quick start guides are really helpful and well documented.

gopidesupavan, May 14 '24 05:05

Awesome PR!

Thank you @vincbeck 😃

gopidesupavan, May 15 '24 08:05