[Kendra] Jira connector not syncing

Open ssmails opened this issue 1 year ago • 11 comments

Describe the bug

Created a Jira connector following this documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/create_data_source.html

The data source is created, but the sync fails, and Kendra shows no logs for the error. Creating and syncing from the Kendra UI with the same configuration as the failing case works, so this seems to be a boto3/Kendra compatibility issue.

Regression Issue

  • [ ] Select this option if this issue appears to be a regression.

Expected Behavior

The Jira sync should work.

Current Behavior

The data source is created, but the sync fails, and Kendra shows no logs for the error. Creating and syncing from the Kendra UI with the same configuration as the failing case works, so this seems to be a boto3/Kendra compatibility issue.

Reproduction Steps

    def create_new_data_source_jira(self, index_id: str):
        """
        Creates a new Kendra data source based on the configuration provided in the YAML file.

        Returns:
        str: The ID of the created data source, or the ID of the existing data source if one already exists.
        """
        data_source_config = self.config['data_source']

        logger.info(f"role ARN={data_source_config['configuration']['role_arn']}")
        logger.info(f"indexid={data_source_config['indexid']}")

        try:
            response = self.kendra.create_data_source(
                RoleArn=data_source_config['configuration']['role_arn'],
                Name=data_source_config['name'],
                IndexId=data_source_config['indexid'],
                Type=data_source_config['type'],
                Configuration={
                    'JiraConfiguration': {
                        'JiraAccountUrl': 'working jira url which works from kendra ui',
                        'SecretArn': 'working secret ARN which works from kendra ui',
                        'IssueSubEntityFilter': ['COMMENTS','ATTACHMENTS','WORKLOGS'],
                        'IssueType': ['BUG','STORY','TASK','EPIC'],
                        'AttachmentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'CommentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'IssueFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'ProjectFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'WorkLogFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                    }
                }
            )

            data_source_id = response['Id']
            logger.info(f"Data source created with ID: {data_source_id}")

            return data_source_id
        except Exception as e:
            logger.error(f"Error creating data source: {str(e)}")
            raise KendraAdapterException(f"Error creating data source: {str(e)}")

    def start_ingestion_jira(self, index_id, data_source_id):
        try:
            response = self.kendra.start_data_source_sync_job(
                Id=data_source_id,
                IndexId=index_id
            )

            sync_job_id = response['ExecutionId']
            logger.info(f"Data source sync job started with ID: {sync_job_id}")
        except Exception as e:
            logger.error(f"Error starting data source sync job: {str(e)}")
            raise KendraAdapterException(f"Error starting data source sync job: {str(e)}")

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.35.16

Environment details (OS name and version, etc.)

Mac

ssmails avatar Sep 30 '24 21:09 ssmails

Thanks for reaching out. Can you provide more details regarding the sync failure? What error are you getting? The create_data_source command makes a request to the CreateDataSource API, so we'll need more information to investigate if there is some issue with the underlying API behavior.

Can you share a complete code snippet for reproducing the behavior, as well as debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script?

tim-finnigan avatar Oct 01 '24 22:10 tim-finnigan

@tim-finnigan Thanks. Unfortunately, there seems to be no error in CloudWatch for Kendra for this sync failure.

It is simple to reproduce.

  1. Use boto3 to create a Jira connector with the following Jira configuration.
                    'JiraConfiguration': {
                        'JiraAccountUrl': 'working jira url which works from kendra ui',
                        'SecretArn': 'working secret ARN which works from kendra ui',
                        'IssueSubEntityFilter': ['COMMENTS','ATTACHMENTS','WORKLOGS'],
                        'IssueType': ['BUG','STORY','TASK','EPIC'],
                        'AttachmentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'CommentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'IssueFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'ProjectFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'WorkLogFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                    }
  2. Sync the data source that was created -> the sync fails.

ssmails avatar Oct 02 '24 16:10 ssmails

Can you share a complete code snippet for reproducing the behavior, as well as debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script? I'd like to see the specific API response that is being returned here.

tim-finnigan avatar Oct 02 '24 19:10 tim-finnigan

Sharing the code snippet for the sync failure and the requested logs. @tim-finnigan

Note - Datasource was pre-created with 'JiraConfiguration' provided above, using the boto3 sdk. (That operation was successful)

jira-boto3-log-failure.txt

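import logging
import time

import boto3
# ClientError is defined in botocore.exceptions
from botocore.exceptions import ClientError

# Send botocore debug logs (requests and responses) to the console
boto3.set_stream_logger('')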

def jira_ingest():
    try:
        logging.debug("START.")

        kendra = boto3.client(
            'kendra'
        )
        logging.info("Kendra client initialized successfully.")
        
        response = kendra.start_data_source_sync_job(
            Id="mydatssourceid",
            IndexId="myindexid"
        )
        sync_job_id = response['ExecutionId']
        logging.debug(f"Data source sync job started with ID: {sync_job_id}")
        time.sleep(600)

    except Exception as e:
        logging.error(f"Error: {str(e)}")


if __name__ == '__main__':
    jira_ingest()

ssmails avatar Oct 02 '24 21:10 ssmails

Thanks for following up. From your logs I see that a sync job was started and there is no error or failure present. When you say the sync fails, could you share more details on the failure?

Here in the Kendra Developer Guide is a troubleshooting section on sync jobs failing: https://docs.aws.amazon.com/kendra/latest/dg/troubleshooting-data-sources.html#troubleshooting-data-sources-failed. Have you tried going through the steps documented there?

tim-finnigan avatar Oct 07 '24 21:10 tim-finnigan

The sync shows as failed in Kendra without any errors. I have already consulted the developer guide. I want to mention again that with the same role, policy, and other configuration, it works when I try it manually from the Kendra UI. So this seems to be an issue with the boto3 library for Jira. I have also provided the code snippet to reproduce the issue, as requested earlier.
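One way to check whether the two data sources really share a configuration is to fetch each with describe_data_source and compare the returned Configuration blocks. The helper below is only a sketch of that comparison: diff_configs is a hypothetical name, and it works on plain dicts so it makes no assumption about which keys the UI-created connector uses.

```python
def diff_configs(left, right, path=""):
    """Return the dotted paths at which two nested structures differ."""
    if isinstance(left, dict) and isinstance(right, dict):
        diffs = []
        for key in sorted(set(left) | set(right)):
            child = f"{path}.{key}" if path else str(key)
            if key not in left or key not in right:
                # Key exists on one side only
                diffs.append(child)
            else:
                diffs.extend(diff_configs(left[key], right[key], child))
        return diffs
    # Leaves (strings, lists, numbers) are compared as whole values
    return [] if left == right else [path or "<root>"]
```

With a boto3 Kendra client, the inputs could come from kendra.describe_data_source(Id=..., IndexId=...)['Configuration'], called once for the UI-created data source and once for the boto3-created one.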

ssmails avatar Oct 08 '24 03:10 ssmails

@tim-finnigan appreciate your quick response on this. All requested information has been provided. Thanks.

ssmails avatar Oct 08 '24 16:10 ssmails

The sync shows failed in Kendra without any errors.

From your logs I see that the StartDataSourceSyncJob request succeeded. Can you provide more details on the failure? You can try running list_data_source_sync_jobs to get more info.
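A minimal sketch of that check, assuming a boto3 Kendra client is passed in as kendra. The helper name failed_sync_jobs is hypothetical. The fields it reads (History, Status, ErrorCode, ErrorMessage, DataSourceErrorCode, NextToken) are taken from the list_data_source_sync_jobs response shape, and any of the error fields may be absent for a given job.

```python
def failed_sync_jobs(kendra, index_id, data_source_id):
    """Collect the error details of every FAILED sync job for one data source."""
    failures = []
    kwargs = {"Id": data_source_id, "IndexId": index_id}
    while True:
        page = kendra.list_data_source_sync_jobs(**kwargs)
        for job in page.get("History", []):
            if job.get("Status") != "FAILED":
                continue
            failures.append({
                "ExecutionId": job.get("ExecutionId"),
                "ErrorCode": job.get("ErrorCode"),
                "ErrorMessage": job.get("ErrorMessage"),
                "DataSourceErrorCode": job.get("DataSourceErrorCode"),
            })
        # Follow pagination until the service stops returning a token
        token = page.get("NextToken")
        if not token:
            return failures
        kwargs["NextToken"] = token
```

Calling failed_sync_jobs(boto3.client('kendra'), "myindexid", "mydatssourceid") with the IDs from the script above would return one dict per failed job, which is the detail that was missing from the report.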

Also here are docs on setting up the Jira connector, have you followed those? https://docs.aws.amazon.com/kendra/latest/dg/data-source-jira.html

tim-finnigan avatar Oct 11 '24 20:10 tim-finnigan

FYI, we reached out to the Kendra team and their response is below. @tim-finnigan, it would be appreciated if the boto3 docs could be updated accordingly so people know which connectors work. The current boto3 docs don't mention the templatized connector at all, and the sample config is for the old mechanism. We are running into this with several Kendra connectors that we are trying to use via boto3. Is there a list of working connectors with accurate boto3 examples for Kendra that you can provide?

I hope you are doing well.

I am reaching out with the latest updates on this bug.

The issue is with the non-templatized Jira connector. We are planning on launching a templatized connector which will resolve the issue, but we are still awaiting the date to launch the templatized connector from our team.

Our internal team have tried debugging the issue but were unable to identify a better solution to fix this. I will update you once we have a date for the templatized connector to be launched.

ssmails avatar Oct 14 '24 16:10 ssmails

Thanks for following up. I found the internal tracking item for this issue. Regarding the "non-templatized Jira connector" issue, that is something that the Kendra team would need to address. They should also be able to provide guidance for your use case.

There are not currently Kendra examples for boto3 in the documentation or code examples repository. The Kendra developer guide has a getting started guide but no specific Python SDK examples I could find for data source connectors. If there's a specific connector that you want to try other than the Jira one then let me know and I can try to look into it. But I recommend that you continue your correspondence on the source case with any additional questions to receive further guidance from the appropriate team.

tim-finnigan avatar Oct 14 '24 16:10 tim-finnigan

@tim-finnigan, the Kendra team was able to provide us with a workaround. If there is a place where you keep examples or docs, I would be happy to contribute.

ssmails avatar Oct 18 '24 17:10 ssmails

Followed up here in your other issue regarding where examples could be added. The Kendra team followed up on the internal ticket and noted that a fix for the Jira connector was deployed.

tim-finnigan avatar Nov 04 '24 18:11 tim-finnigan

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.

github-actions[bot] avatar Nov 04 '24 18:11 github-actions[bot]