[Kendra] sharepoint datasource (AZURE_AD Authentication) fails in data_source_sync_job

Open ssmails opened this issue 5 months ago • 7 comments

Describe the bug

[Kendra]

  • SharePoint data source (AZURE_AD authentication): I am able to create the data source using boto3 create_data_source().
  • The data source created above fails to sync. The boto3 start_data_source_sync_job() call succeeds, but the sync job then fails without reporting any error.

I want to understand whether the AZURE_AD authentication option is supported by the following versions: boto3==1.35.16, botocore==1.35.16.

The documentation states that only HTTP_BASIC | OAUTH2 are supported, per https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/create_data_source.html

'AuthenticationType': 'HTTP_BASIC'|'OAUTH2',

But when I run the code, the validation error lists AZURE_AD as a valid option too:

Value at "configuration.sharePointConfiguration.authenticationType" failed to satisfy constraint: Member must satisfy enum value set: [AZURE_AD, HTTP_BASIC, OAUTH2]
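
One way to see where the mismatch comes from is to print the enum that the locally installed botocore model lists for this parameter. The sketch below is only an illustration: `sharepoint_auth_types` is a hypothetical helper name, and it assumes the client exposes the usual `client.meta.service_model` shapes. As far as I know botocore does not validate enum values client-side, so an error like the one above would come from the Kendra service itself, whose enum can be newer than the bundled model and the generated documentation.

```python
def sharepoint_auth_types(client):
    """Return the AuthenticationType values listed in the client's bundled service model.

    `client` is expected to be a boto3 Kendra client, e.g. boto3.client("kendra").
    """
    # Walk the CreateDataSource input shape down to the enum of interest.
    shape = client.meta.service_model.operation_model("CreateDataSource").input_shape
    for member in ("Configuration", "SharePointConfiguration", "AuthenticationType"):
        shape = shape.members[member]
    return list(shape.enum)


# Usage (requires boto3 and a configured region):
#   import boto3
#   print(sharepoint_auth_types(boto3.client("kendra")))
```

If the printed list lacks AZURE_AD while the service error includes it, the documentation is simply generated from an older model than the one the service enforces.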

If it is supported, why is the sync failing? The code snippets below reproduce the failure.

Expected Behavior

If AZURE_AD authentication is supported, the sync should succeed. Creating the data source manually and syncing it from the Kendra UI with AZURE_AD authentication works as expected.
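
Since the UI-created data source works, comparing its stored definition with the boto3-created one may show what differs. The following is a minimal sketch, not a confirmed diagnosis: `diff_data_sources` is a hypothetical helper, and the two data source IDs are assumed to belong to the same index. It relies only on `describe_data_source`, which returns fields such as `Type`, `RoleArn`, and `Configuration`.

```python
def diff_data_sources(client, index_id, ui_id, sdk_id,
                      keys=("Type", "RoleArn", "Configuration")):
    """Return {field: (ui_value, sdk_value)} for every compared field that differs."""
    ui = client.describe_data_source(Id=ui_id, IndexId=index_id)
    sdk = client.describe_data_source(Id=sdk_id, IndexId=index_id)
    return {k: (ui.get(k), sdk.get(k)) for k in keys if ui.get(k) != sdk.get(k)}


# Usage (IDs are placeholders):
#   import boto3
#   diffs = diff_data_sources(boto3.client("kendra"), "index-id", "ui-ds-id", "sdk-ds-id")
#   for field, (ui_value, sdk_value) in diffs.items():
#       print(field, ui_value, sdk_value, sep="\n  ")
```

Any field that appears in the result is a candidate explanation for why one data source syncs and the other does not.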

Current Behavior

[Kendra]

  • SharePoint data source (AZURE_AD authentication): I am able to create the data source using boto3 create_data_source().
  • The data source created above fails to sync. The boto3 start_data_source_sync_job() call succeeds, but the sync job then fails without reporting any error.

Reproduction Steps

    def create_new_data_source_sharepoint(self, index_id: str):
        """
        Creates a new Kendra data source based on the configuration provided in the YAML file.

        Returns:
            str: The ID of the created data source, or the ID of the existing data source if one already exists.
        """

        data_source_config = self.config['data_source']
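        # Pass the value through a %s placeholder; extra args without one trigger a logging format error.
        logger.info("config=%s", data_source_config)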
        #logger.info("auth=", data_source_config['configuration']['SharePointConfiguration']['AuthenticationType'])

        try:
            # using hardcoded sharepoint config
            response = self.kendra.create_data_source(
                RoleArn='correct role ARN that works via Kendra UI',
                Name=data_source_config['name'],
                IndexId='pre-created Kendra index ID',
                Type='SHAREPOINT',
                Configuration={
                    'SharePointConfiguration': {
                        'SharePointVersion': 'SHAREPOINT_ONLINE',
                        'Urls': [
                            'sharepoint site that works via Kendra UI'
                        ],
                        'AuthenticationType': 'AZURE_AD',
                        'SecretArn': 'correct secret ARN for Azure AD, which works via Kendra UI',
                    }
                }
            )

            data_source_id = response['Id']
            logger.info(f"Data source created with ID: {data_source_id}")

            return data_source_id
        except Exception as e:
            logger.error(f"Error creating data source: {str(e)}")
            raise KendraAdapterException(f"Error creating data source: {str(e)}")

    def start_ingestion_sharepoint(self, index_id, data_source_id):
        try:
            while True:
                status = self.kendra.describe_index(Id=index_id)['Status']
                if status == 'ACTIVE':
                    logger.info(f"Index {index_id} is active")
                    break
                logger.info(f"Waiting for index {index_id} to become active. Current status: {status}")
                time.sleep(30)

            response = self.kendra.start_data_source_sync_job(
                Id=data_source_id,
                IndexId=index_id
            )

            sync_job_id = response['ExecutionId']
            logger.info(f"Data source sync job started with ID: {sync_job_id}")
        except Exception as e:
            logger.error(f"Error starting data source sync job: {str(e)}")
            raise KendraAdapterException(f"Error starting data source sync job: {str(e)}")
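
The snippet above stops once the job is started, so a failure during the sync itself is never surfaced. A minimal sketch of polling the job until it finishes is shown below. It uses `list_data_source_sync_jobs`, whose `History` entries carry `Status` and, on failure, fields such as `ErrorMessage` and `ErrorCode`. The helper name `wait_for_sync_job`, the polling interval, and the assumption that the job appears on the first page of results are illustrative choices, not part of the original code.

```python
import time

# Statuses after which the sync job will not change any more.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "INCOMPLETE", "ABORTED"}


def wait_for_sync_job(client, index_id, data_source_id, execution_id,
                      poll_seconds=30, max_polls=120):
    """Poll the sync job history until the given job reaches a terminal status.

    Returns the job's history entry (a dict), which includes ErrorMessage and
    ErrorCode when the service reports them.
    """
    for _ in range(max_polls):
        history = client.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )["History"]
        job = next((j for j in history if j.get("ExecutionId") == execution_id), None)
        if job is not None and job.get("Status") in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"Sync job {execution_id} did not finish after {max_polls} polls")


# Usage, continuing from start_data_source_sync_job():
#   job = wait_for_sync_job(kendra, index_id, data_source_id, sync_job_id)
#   print(job["Status"], job.get("ErrorCode"), job.get("ErrorMessage"))
```

Logging the returned entry for a failed job is usually the quickest way to get an actual error to attach to a report like this one.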

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.35.16

Environment details (OS name and version, etc.)

Mac OS

ssmails · Sep 11 '24 17:09