[Confluence] - Incorrect space name causes loop of traceback errors on indexing
A user provided a nonexistent space name. However, the connector's indexing status shows up as "Succeeded" (for 0 documents) instead of failing with the actual error message (No space with key: xxx).
I believe some validation should be required before a connector is created (i.e., if indexing a space, ensure it exists first).
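For illustration, a rough sketch of what such a pre-creation check could look like. It assumes the client exposes a `get_space(space_key)` call (the `atlassian` Confluence client does) that raises when the space is missing or not visible. `SpaceClient`, `ConnectorValidationError`, and `validate_space_exists` are hypothetical names, not existing Danswer code.

```python
from typing import Any, Protocol


class SpaceClient(Protocol):
    """Minimal interface this sketch relies on (hypothetical name)."""

    def get_space(self, space_key: str) -> Any: ...


class ConnectorValidationError(Exception):
    """Hypothetical error raised when a connector config cannot be used."""


def validate_space_exists(client: SpaceClient, space_key: str) -> None:
    """Fail fast if the space cannot be fetched (missing or not visible)."""
    try:
        space = client.get_space(space_key)
    except Exception as e:
        # Broad catch on purpose: the logs in this issue show both an
        # HTTPError and an ApiPermissionError for the same bad space key.
        raise ConnectorValidationError(
            f"Cannot access Confluence space '{space_key}': {e}"
        ) from e
    if not space:
        # Defensive: treat an empty response as a missing space too.
        raise ConnectorValidationError(f"No space with key: {space_key}")
```

Calling this at connector creation time would surface the real error to the user right away instead of after a "Succeeded" run with 0 documents.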
WARNING: 10/01/2024 03:11:02 PM connector.py 369: [CC Pair ID: 105] [Attempt ID: 109963] Batch failed with space LATAM TC at offset 0 with size 16, processing pages individually...
ERROR: 10/01/2024 03:11:02 PM connector.py 448: [CC Pair ID: 105] [Attempt ID: 109963] Ran into exception when fetching pages from Confluence
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 533, in get_all_pages_from_space_raw
response = self.get(url, params=params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/rest_client.py", line 285, in get
response = self.request(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/rest_client.py", line 257, in request
self.raise_for_status(response)
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 3091, in raise_for_status
raise HTTPError(error_msg, response=response)
requests.exceptions.HTTPError: No space with key : LATAM TC
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/danswer/connectors/confluence/connector.py", line 359, in _fetch_space
return get_all_pages_from_space(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/danswer/connectors/confluence/rate_limit_handler.py", line 33, in wrapped_call
return confluence_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 570, in get_all_pages_from_space
return self.get_all_pages_from_space_raw(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 536, in get_all_pages_from_space_raw
raise ApiPermissionError(
atlassian.errors.ApiPermissionError: The calling user does not have permission to view the content
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 533, in get_all_pages_from_space_raw
response = self.get(url, params=params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/rest_client.py", line 285, in get
response = self.request(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/rest_client.py", line 257, in request
self.raise_for_status(response)
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 3091, in raise_for_status
raise HTTPError(error_msg, response=response)
requests.exceptions.HTTPError: No space with key : LATAM TC
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/danswer/connectors/confluence/connector.py", line 441, in _fetch_pages
_fetch_space(start_ind, self.batch_size)
File "/app/danswer/connectors/confluence/connector.py", line 380, in _fetch_space
get_all_pages_from_space(
File "/app/danswer/connectors/confluence/rate_limit_handler.py", line 33, in wrapped_call
return confluence_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 570, in get_all_pages_from_space
return self.get_all_pages_from_space_raw(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/atlassian/confluence.py", line 536, in get_all_pages_from_space_raw
raise ApiPermissionError(
atlassian.errors.ApiPermissionError: The calling user does not have permission to view the content
The same behavior occurs when the Confluence bot doesn't have permissions on the space. In this case, the green light (Succeeded) is counter-intuitive. A yellow one to indicate a warning could be an improvement. A completed indexing run without any documents should not be considered normal.
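Something along these lines is what I have in mind, as a sketch only. `IndexingOutcome` and `classify_attempt` are made-up names, and the real status model may look quite different.

```python
from enum import Enum


class IndexingOutcome(str, Enum):
    """Hypothetical three-level status: green, yellow, red."""

    SUCCESS = "success"
    WARNING = "completed_with_warnings"
    FAILED = "failed"


def classify_attempt(docs_indexed: int, error_count: int) -> IndexingOutcome:
    """Map the result of an indexing attempt to a status light."""
    if error_count > 0 and docs_indexed == 0:
        # Nothing indexed and errors were logged: a real failure.
        return IndexingOutcome.FAILED
    if error_count > 0 or docs_indexed == 0:
        # Partial errors, or a clean run that found no documents.
        return IndexingOutcome.WARNING
    return IndexingOutcome.SUCCESS
```

With this rule, the run in the report above (0 documents, errors in the log) would show as failed, and an empty but error-free run would show yellow rather than green.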
Great idea, added this to the roadmap! We plan on adding some additional checks for this at creation soon.