[SPO] harden error handling for single-document issues

Open seanstory opened this issue 1 year ago • 3 comments

Problem Description

The SharePoint Online (SPO) connector fails the whole sync job if a single request errors.

Screenshot 2023-09-05 at 12 26 09 PM

We should:

  • [ ] make sure that this specific error does not occur
  • [ ] make the whole connector more flexible and resilient to these types of errors
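
As a rough illustration of the second item, the sketch below isolates failures per list so that one failed request does not abort the whole run. It is not the connector's actual code: `FetchError`, `fetch_list_items`, `get_docs`, and `collect` are hypothetical names invented for this example, and the fetcher is a stub instead of a real Graph API call.

```python
import asyncio
import logging

logger = logging.getLogger("spo_sketch")


class FetchError(Exception):
    """Hypothetical stand-in for a per-request failure (e.g. a bad HTTP response)."""


async def fetch_list_items(list_id):
    # Hypothetical stub fetcher: the list named "broken" fails, others yield two items.
    if list_id == "broken":
        raise FetchError(f"request for list {list_id} failed")
    for n in range(2):
        yield {"id": f"{list_id}-{n}"}


async def get_docs(list_ids, skipped):
    """Yield documents from every list, skipping any list whose request fails."""
    for list_id in list_ids:
        try:
            async for item in fetch_list_items(list_id):
                yield item
        except FetchError as e:
            # Record the failure and move on instead of failing the whole sync.
            logger.warning("Skipping list %s: %s", list_id, e)
            skipped.append(list_id)


async def collect(list_ids):
    """Run get_docs to completion and return (documents, skipped list ids)."""
    skipped = []
    docs = [doc async for doc in get_docs(list_ids, skipped)]
    return docs, skipped
```

Calling `asyncio.run(collect(["a", "broken", "b"]))` returns the documents from lists `a` and `b` and reports `broken` as skipped. Whether a skipped list should also mark the sync as partially failed is a separate design decision this sketch does not make.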

seanstory, Sep 05 '23 17:09

Example of another error we should be able to retry or move past:

Received 400 response from https://graph.microsoft.com/v1.0/sites/<site_id>/lists/<list_id>/items?$select=createdDateTime,id,lastModifiedDateTime,weburl,createdBy,lastModifiedBy,contentType&$expand=fields($select=Title,Link,Attachments,LinkTitle,LinkFilename,Description,Conversation)
Full stack trace:
[FMWK][12:44:01][WARNING] [Sync Job id: DjoWlIoB26kkwmCbnNr-, connector id: DToVlIoB26kkwmCb0dpW, index name: search-retail] Received 400 response from https://graph.microsoft.com/v1.0/sites/<site_id>/lists/<list_id>/items?$select=createdDateTime,id,lastModifiedDateTime,weburl,createdBy,lastModifiedBy,contentType&$expand=fields($select=Title,Link,Attachments,LinkTitle,LinkFilename,Description,Conversation)
[FMWK][12:44:01][CRITICAL] [Sync Job id: DjoWlIoB26kkwmCbnNr-, connector id: DToVlIoB26kkwmCb0dpW, index name: search-retail] The document fetcher failed
Traceback (most recent call last):
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 402, in _get
    async with self._http_session.get(
  File "/path/to/connectors-python/lib/python3.10/site-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
  File "/path/to/connectors-python/lib/python3.10/site-packages/aiohttp/client.py", line 643, in _request
    resp.raise_for_status()
  File "/path/to/connectors-python/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 1005, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 400, message='Bad Request', url=URL('https://graph.microsoft.com/v1.0/sites/<site_id>/lists/<list_id>/items?$select=createdDateTime,id,lastModifiedDateTime,weburl,createdBy,lastModifiedBy,contentType&$expand=fields($select=Title,Link,Attachments,LinkTitle,LinkFilename,Description,Conversation)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/connectors-python/connectors/es/sink.py", line 387, in get_docs
    async for count, doc in aenumerate(generator):
  File "/path/to/connectors-python/connectors/utils.py", line 689, in aenumerate
    async for elem in asequence:
  File "/path/to/connectors-python/connectors/logger.py", line 134, in __anext__
    return await self.gen.__anext__()
  File "/path/to/connectors-python/connectors/es/sink.py", line 360, in _decorate_with_metrics_span
    async for doc in generator:
  File "/path/to/connectors-python/connectors/sync_job_runner.py", line 310, in prepare_docs
    async for doc, lazy_download, operation in self.generator():
  File "/path/to/connectors-python/connectors/sync_job_runner.py", line 342, in generator
    async for doc, lazy_download in self.data_provider.get_docs(
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 1547, in get_docs
    async for list_item, download_func in self.site_list_items(
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 1777, in site_list_items
    async for list_item in self.client.site_list_items(site_id, site_list_id):
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 745, in site_list_items
    async for page in self._graph_api_client.scroll(
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 347, in scroll
    graph_data = await self._get_json(scroll_url)
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 371, in _get_json
    async with self._get(absolute_url) as resp:
  File "/Users/gustavollermalylarrain/miniforge3/envs/connector/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 295, in wrapped
    async for item in func(*args, **kwargs):
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 413, in _get
    await self._handle_client_response_error(absolute_url, e)
  File "/path/to/connectors-python/connectors/sources/sharepoint_online.py", line 443, in _handle_client_response_error
    raise BadRequestError from e
connectors.sources.sharepoint_online.BadRequestError

Why should we not consider this a real 400? Because it's really not. SPO lies.
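
One way to "retry or move past" such a response is sketched below, assuming the spurious 400 is transient. This is only an illustration, not what the draft PR does: `BadRequestError` here is a local stand-in for the exception in the trace, and `get_with_retries`, `make_flaky_fetch`, the retry count, and the delays are all hypothetical.

```python
import asyncio


class BadRequestError(Exception):
    """Local stand-in for the BadRequestError raised in the trace above."""


async def get_with_retries(fetch, url, retries=3, base_delay=1.0):
    """Call fetch(url); retry on BadRequestError, return None once retries run out."""
    for attempt in range(retries):
        try:
            return await fetch(url)
        except BadRequestError:
            if attempt == retries - 1:
                # Give up on this URL so the caller can move on to the next one.
                return None
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2**attempt)


def make_flaky_fetch(failures):
    """Build a hypothetical fetch that raises BadRequestError `failures` times, then succeeds."""
    calls = {"n": 0}

    async def fetch(url):
        calls["n"] += 1
        if calls["n"] <= failures:
            raise BadRequestError(url)
        return {"url": url, "attempts": calls["n"]}

    return fetch
```

With `make_flaky_fetch(2)` the third attempt succeeds; with more failures than `retries` the helper returns `None` and the caller can skip that list. Retrying only makes sense if the 400 really is transient; a genuinely malformed request would fail every time and should go straight to the skip path.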

Screenshot 2023-09-14 at 4 56 43 PM

seanstory, Sep 15 '23 13:09

There's a draft PR here: https://github.com/elastic/connectors-python/pull/1584, but it's nowhere near ready, and there are other priorities right now. I'm going to un-assign myself and remove it from the current sprint until this can be prioritized.

The urgent piece has been fixed.

seanstory, Sep 19 '23 19:09

Another single-document failure issue: https://github.com/elastic/enterprise-search-team/issues/7044

seanstory, Mar 18 '24 18:03