[Kendra] Jira connector not syncing
Describe the bug
I created a Jira connector following this documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/create_data_source.html
The data source is created, but the sync fails, and Kendra shows no logs for the error. Creating and syncing from the Kendra UI with the same configuration as the failed case works fine, so this looks like a boto3/Kendra compatibility issue.
Regression Issue
- [ ] Select this option if this issue appears to be a regression.
Expected Behavior
The Jira sync should succeed.
Current Behavior
The data source is created, but the sync fails, and Kendra shows no logs for the error. Creating and syncing from the Kendra UI with the same configuration as the failed case works fine, so this looks like a boto3/Kendra compatibility issue.
Reproduction Steps
def create_new_data_source_jira(self, index_id: str):
    """
    Creates a new Kendra data source based on the configuration provided in the YAML file.

    Returns:
        str: The ID of the created data source, or the ID of the existing data source if one already exists.
    """
    data_source_config = self.config['data_source']
    logger.info(f"role ARN={data_source_config['configuration']['role_arn']}")
    logger.info(f"indexid={data_source_config['indexid']}")
    try:
        response = self.kendra.create_data_source(
            RoleArn=data_source_config['configuration']['role_arn'],
            Name=data_source_config['name'],
            IndexId=data_source_config['indexid'],
            Type=data_source_config['type'],
            Configuration={
                'JiraConfiguration': {
                    'JiraAccountUrl': 'working jira url which works from kendra ui',
                    'SecretArn': 'working secret arn which works from kendra ui',
                    'IssueSubEntityFilter': ['COMMENTS', 'ATTACHMENTS', 'WORKLOGS'],
                    'IssueType': ['BUG', 'STORY', 'TASK', 'EPIC'],
                    'AttachmentFieldMappings': [
                        {
                            'DataSourceFieldName': 'url',
                            'IndexFieldName': '_source_uri'
                        },
                    ],
                    'CommentFieldMappings': [
                        {
                            'DataSourceFieldName': 'url',
                            'IndexFieldName': '_source_uri'
                        },
                    ],
                    'IssueFieldMappings': [
                        {
                            'DataSourceFieldName': 'url',
                            'IndexFieldName': '_source_uri'
                        },
                    ],
                    'ProjectFieldMappings': [
                        {
                            'DataSourceFieldName': 'url',
                            'IndexFieldName': '_source_uri'
                        },
                    ],
                    'WorkLogFieldMappings': [
                        {
                            'DataSourceFieldName': 'url',
                            'IndexFieldName': '_source_uri'
                        },
                    ],
                }
            }
        )
        data_source_id = response['Id']
        logger.info(f"Data source created with ID: {data_source_id}")
        return data_source_id
    except Exception as e:
        logger.error(f"Error creating data source: {str(e)}")
        raise KendraAdapterException(f"Error creating data source: {str(e)}")

def start_ingestion_jira(self, index_id, data_source_id):
    try:
        response = self.kendra.start_data_source_sync_job(
            Id=data_source_id,
            IndexId=index_id
        )
        sync_job_id = response['ExecutionId']
        logger.info(f"Data source sync job started with ID: {sync_job_id}")
    except Exception as e:
        logger.error(f"Error starting data source sync job: {str(e)}")
        raise KendraAdapterException(f"Error starting data source sync job: {str(e)}")
Possible Solution
No response
Additional Information/Context
No response
SDK version used
1.35.16
Environment details (OS name and version, etc.)
Mac
Thanks for reaching out. Can you provide more details regarding the sync failure? What error are you getting? The create_data_source command makes a request to the CreateDataSource API, so we'll need more information to investigate if there is some issue with the underlying API behavior.
Can you share a complete code snippet for reproducing the behavior, as well as debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script?
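For anyone collecting the debug logs requested above, here is a minimal sketch of one way to mask 12-digit AWS account IDs before the output is pasted into an issue. The `RedactAccountIds` filter and `enable_redacted_debug_logging` helper are hypothetical names, not part of boto3. The sketch uses only the standard `logging` module and attaches to the `botocore` logger. It masks account IDs only, so tokens, signatures, and URLs in the output still need a manual review.

```python
import logging
import re


class RedactAccountIds(logging.Filter):
    """Hypothetical filter: mask 12-digit AWS account IDs in log messages."""

    _ACCOUNT_ID = re.compile(r"(?<!\d)\d{12}(?!\d)")

    def filter(self, record):
        # Render the message with its arguments, then mask any account ID.
        record.msg = self._ACCOUNT_ID.sub("************", record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it


def enable_redacted_debug_logging(logger_name="botocore"):
    """Send DEBUG output of the named logger to stderr through the filter."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler()
    handler.addFilter(RedactAccountIds())
    logger.addHandler(handler)
    return handler
```

Calling `enable_redacted_debug_logging()` before creating the Kendra client would stand in for `boto3.set_stream_logger('')` in a script like the one requested.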
@tim-finnigan Thanks. Unfortunately, there seems to be no error in CloudWatch for Kendra for this sync failure.
It is simple to reproduce:
- Use boto3 to create a Jira connector with the following Jira configuration.
'JiraConfiguration': {
    'JiraAccountUrl': 'working jira url which works from kendra ui',
    'SecretArn': 'working secret arn which works from kendra ui',
    'IssueSubEntityFilter': ['COMMENTS', 'ATTACHMENTS', 'WORKLOGS'],
    'IssueType': ['BUG', 'STORY', 'TASK', 'EPIC'],
    'AttachmentFieldMappings': [
        {
            'DataSourceFieldName': 'url',
            'IndexFieldName': '_source_uri'
        },
    ],
    'CommentFieldMappings': [
        {
            'DataSourceFieldName': 'url',
            'IndexFieldName': '_source_uri'
        },
    ],
    'IssueFieldMappings': [
        {
            'DataSourceFieldName': 'url',
            'IndexFieldName': '_source_uri'
        },
    ],
    'ProjectFieldMappings': [
        {
            'DataSourceFieldName': 'url',
            'IndexFieldName': '_source_uri'
        },
    ],
    'WorkLogFieldMappings': [
        {
            'DataSourceFieldName': 'url',
            'IndexFieldName': '_source_uri'
        },
    ],
}
- Sync the data source that was created -> the sync fails.
Can you share a complete code snippet for reproducing the behavior, as well as debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script? I'd like to see the specific API response that is being returned here.
Sharing a code snippet for the sync failure, plus the requested logs. @tim-finnigan
Note: the data source was pre-created with the 'JiraConfiguration' provided above, using the boto3 SDK. That operation was successful.
import logging
import time

import boto3
from botocore.exceptions import ClientError

# Log boto3/botocore activity at DEBUG level via the root logger.
boto3.set_stream_logger('')


def jira_ingest():
    try:
        logging.debug("START.")
        kendra = boto3.client('kendra')
        logging.info("Kendra client initialized successfully.")
        response = kendra.start_data_source_sync_job(
            Id="mydatssourceid",
            IndexId="myindexid"
        )
        sync_job_id = response['ExecutionId']
        logging.debug(f"Data source sync job started with ID: {sync_job_id}")
        # Wait 10 minutes to give the sync job time to run.
        time.sleep(600)
    except Exception as e:
        logging.error(f"Error: {str(e)}")


if __name__ == '__main__':
    jira_ingest()
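The snippet above sleeps for a fixed 600 seconds and never reads the job's final status. As a rough sketch of an alternative, the hypothetical helper below polls `list_data_source_sync_jobs` until the job with the given execution ID reaches a terminal status. It assumes the job appears on the first page of results, and the set of terminal statuses is my reading of the API's status values rather than something stated in this thread. The `sleep` parameter exists only so the loop can be exercised without waiting.

```python
import time

# Assumed terminal states; SYNCING, SYNCING_INDEXING and STOPPING keep polling.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "INCOMPLETE", "ABORTED"}


def wait_for_sync_job(kendra, index_id, data_source_id, execution_id,
                      poll_seconds=30, max_polls=40, sleep=time.sleep):
    """Poll until the sync job finishes; return its history entry or None."""
    for _ in range(max_polls):
        page = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )
        for job in page.get("History", []):
            if (job.get("ExecutionId") == execution_id
                    and job.get("Status") in TERMINAL_STATUSES):
                return job
        sleep(poll_seconds)
    return None  # gave up before the job reached a terminal status
```

With a real client this would be called as `wait_for_sync_job(boto3.client('kendra'), "myindexid", "mydatssourceid", sync_job_id)`, and the returned entry's `Status`, `ErrorCode`, and `ErrorMessage` fields could then be logged.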
Thanks for following up. From your logs I see that a sync job was started and there is no error or failure present. When you say the sync fails, could you share more details on the failure?
Here in the Kendra Developer Guide is a troubleshooting section on sync jobs failing: https://docs.aws.amazon.com/kendra/latest/dg/troubleshooting-data-sources.html#troubleshooting-data-sources-failed. Have you tried going through the steps documented there?
The sync shows as failed in Kendra without any errors. I have already consulted the developer guide. I want to mention again that, using the same role, policy, and other configuration, it works when I try manually from the Kendra UI. So this seems to be an issue with the boto3 library for Jira. I have also provided the code snippet to reproduce the issue, as requested earlier.
@tim-finnigan appreciate your quick response on this. All requested information has been provided. Thanks.
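One way to test the "same configuration" claim is to compare what Kendra actually stored for the UI-created and the boto3-created data sources. The following is only a sketch: `diff_configs` and `compare_data_sources` are hypothetical helpers, both data sources are assumed to live in the same index, and the choice of fields to compare is mine.

```python
def diff_configs(a, b, path=""):
    """Yield (path, value_in_a, value_in_b) for every leaf that differs."""
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            yield from diff_configs(a.get(key), b.get(key), f"{path}/{key}")
    elif a != b:
        yield (path or "/", a, b)


def compare_data_sources(kendra, index_id, first_id, second_id):
    """Diff selected fields of two data sources as returned by Kendra."""
    first = kendra.describe_data_source(Id=first_id, IndexId=index_id)
    second = kendra.describe_data_source(Id=second_id, IndexId=index_id)
    keys = ("Type", "Configuration", "RoleArn", "Schedule")
    return list(diff_configs(
        {k: first.get(k) for k in keys},
        {k: second.get(k) for k in keys},
    ))
```

An empty result would support the claim that the two data sources are configured identically. Any entries show exactly where they diverge.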
> The sync shows failed in Kendra without any errors.
From your logs I see that the StartDataSourceSyncJob request succeeded. Can you provide more details on the failure? You can try running list_data_source_sync_jobs to get more info.
Also here are docs on setting up the Jira connector, have you followed those? https://docs.aws.amazon.com/kendra/latest/dg/data-source-jira.html
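The `list_data_source_sync_jobs` suggestion can be sketched as follows. `failed_sync_jobs` is a hypothetical helper that collects the error fields for each failed job and follows `NextToken` pagination. Whether Kendra populates `ErrorCode` and `ErrorMessage` for this particular failure is not known from the thread.

```python
def failed_sync_jobs(kendra, index_id, data_source_id):
    """Return a summary dict for every FAILED sync job of a data source."""
    failures = []
    kwargs = {
        "Id": data_source_id,
        "IndexId": index_id,
        "StatusFilter": "FAILED",
    }
    while True:
        page = kendra.list_data_source_sync_jobs(**kwargs)
        for job in page.get("History", []):
            failures.append({
                "ExecutionId": job.get("ExecutionId"),
                "Status": job.get("Status"),
                "ErrorCode": job.get("ErrorCode"),
                "ErrorMessage": job.get("ErrorMessage"),
            })
        token = page.get("NextToken")
        if not token:
            return failures
        kwargs["NextToken"] = token  # fetch the next page of history
```

With a real client, `failed_sync_jobs(boto3.client('kendra'), "myindexid", "mydatssourceid")` would return the entries to paste into the issue.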
FYI, we reached out to the Kendra team, and their response is below. @tim-finnigan, we would appreciate it if the boto3 docs could be updated accordingly, so people are aware of which connectors work. The current boto3 docs don't mention the templatized connector at all, and the sample config is for the old mechanism. We are running into this with several Kendra connectors that we are trying to use via boto3. Is there a list of working connectors with accurate examples that you can provide for boto3 and Kendra?
I hope you are doing well.
I am reaching out with the latest updates on this bug.
The issue is with the non-templatized Jira connector. We are planning on launching a templatized connector which will resolve the issue, but we are still awaiting the date to launch the templatized connector from our team.
Our internal team have tried debugging the issue but were unable to identify a better solution to fix this. I will update you once we have a date for the templatized connector to be launched.
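For context on what "templatized" means at the boto3 level: `create_data_source` accepts `Type="TEMPLATE"` with a `TemplateConfiguration` in place of a connector-specific block such as `JiraConfiguration`. The sketch below is an assumption-laden illustration, not the Kendra team's recommended fix. `create_template_data_source` is a hypothetical helper, and the `template` dict must follow the connector's template schema from the Kendra Developer Guide, which this thread does not show and which is therefore left to the caller.

```python
def create_template_data_source(kendra, index_id, name, role_arn, template):
    """Create a templatized data source and return its ID.

    `template` is the connector's JSON template as a dict; its contents
    are not specified here and must come from the Kendra documentation.
    """
    response = kendra.create_data_source(
        Name=name,
        IndexId=index_id,
        Type="TEMPLATE",
        RoleArn=role_arn,
        Configuration={"TemplateConfiguration": {"Template": template}},
    )
    return response["Id"]
```

Whether a Jira template was available when this comment was written is unclear from the thread, so treat this as the general shape of the call only.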
Thanks for following up. I found the internal tracking item for this issue. Regarding the "non-templatized Jira connector" issue, that is something that the Kendra team would need to address. They should also be able to provide guidance for your use case.
There are not currently Kendra examples for boto3 in the documentation or code examples repository. The Kendra developer guide has a getting started guide but no specific Python SDK examples I could find for data source connectors. If there's a specific connector that you want to try other than the Jira one then let me know and I can try to look into it. But I recommend that you continue your correspondence on the source case with any additional questions to receive further guidance from the appropriate team.
@tim-finnigan, the Kendra team was able to provide us with a workaround. If there is a place where you keep examples/docs, I would be happy to contribute.
Followed up here in your other issue regarding where examples could be added. The Kendra team followed up on the internal ticket and noted that a fix for the Jira connector was deployed.
This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.