"The job has not seen any update for some time." error is not helpful
We need some better error handling in our connectors.
For example, the message "The job has not seen any update for some time." is not helpful.
"created_at": "2024-03-18T18:39:04.380Z",
"deleted_document_count": 0,
"error": "The job has not seen any update for some time.",
"indexed_document_count": 16521,
"indexed_document_volume": 173,
"job_type": "full",
"last_seen": "2024-03-18T18:59:06.729398+00:00",
"metadata": {},
"started_at": "2024-03-18T18:39:23.061424+00:00",
"status": "error",
"total_document_count": null,
"trigger_method": "on_demand",
Meanwhile, the connector logs are also not much help in diagnosing why this is happening:
My hunch is that this is happening because of a SIGTERM on the Enterprise Search instance running the connector service.
If so, we should handle this kind of failure more gracefully, whether it is "expected" (an intentional restart of the connector service) or "unexpected" (an unplanned termination of the connector service), and provide clear error handling and logging so the user knows they now have to re-run the sync job.
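To make the suggestion concrete, here is a minimal sketch of separating the two cases. It is not the connectors framework's code: `SyncJob`, `mark_interrupted`, `run_until_signal`, and the `in_progress` status name are hypothetical and used only for illustration. The `suspended` and `error` statuses are the ones discussed in this issue.

```python
import asyncio
import signal
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SyncJob:
    """Hypothetical stand-in for a sync job document."""
    id: str
    status: str = "in_progress"  # assumed name for a running job
    error: Optional[str] = None


def mark_interrupted(job: SyncJob, sig: Optional[signal.Signals]) -> SyncJob:
    """Record why a running job stopped.

    sig is the shutdown signal that was caught, or None when the job is
    found orphaned after an unplanned termination.
    """
    if job.status != "in_progress":
        return job  # finished jobs are left untouched
    if sig is not None:
        # Graceful shutdown: suspend so the job is not later reported as idle.
        job.status = "suspended"
        job.error = None
    else:
        # Unplanned termination: tell the user what to do next.
        job.status = "error"
        job.error = (
            "The connector service stopped unexpectedly while this sync "
            "was running. Re-run the sync job."
        )
    return job


async def run_until_signal(jobs: List[SyncJob]) -> None:
    """Wait for SIGTERM/SIGINT, then suspend running jobs (Unix only)."""
    loop = asyncio.get_running_loop()
    caught: "asyncio.Future[signal.Signals]" = loop.create_future()

    def on_signal(sig: signal.Signals) -> None:
        if not caught.done():
            caught.set_result(sig)

    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, on_signal, sig)

    sig = await caught
    for job in jobs:
        mark_interrupted(job, sig)
```

With something like this, a job interrupted by an intentional restart would end up `suspended`, while a job orphaned by a crash would carry a message that tells the user to re-run it.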
I don't think this is an expected failure; to me it looks more like an OOM or some other system crash/interrupt.
But yes, it is odd that after the restart we edit the job's error value to indicate it went idle, instead of setting the job error when we cancel the framework. @wangch079 any thoughts here?
From the logs in the screenshot, it looks like a graceful shutdown, so I would expect the sync job to be set to "suspended" status.
Since it is eventually set to "error" with "The job has not seen any update for some time.", I believe the sync job was not suspended successfully.
We also have a bug report that jobs are no longer suspended on graceful shutdown: https://github.com/elastic/connectors/issues/2167
Fixing it could potentially get rid of this error entirely.
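For the cases where a job still ends up idle (for example after a real crash), the idle check could produce a more actionable message. The sketch below is an assumption about how such a check might look, not the framework's implementation: `idle_error` and `IDLE_THRESHOLD` are hypothetical names, and the five-minute threshold is an illustrative value only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

IDLE_THRESHOLD = timedelta(minutes=5)  # assumed value, for illustration only


def idle_error(
    last_seen: str,
    now: datetime,
    threshold: timedelta = IDLE_THRESHOLD,
) -> Optional[str]:
    """Return an actionable error message if the job looks idle, else None."""
    seen = datetime.fromisoformat(last_seen)
    idle = now - seen
    if idle <= threshold:
        return None
    minutes = int(idle.total_seconds() // 60)
    return (
        f"No update from the connector service since {last_seen} "
        f"({minutes} min ago). The service may have been restarted or "
        "terminated while this sync was running. Re-run the sync job."
    )


# Usage with the last_seen value from the job document in this issue.
now = datetime(2024, 3, 18, 19, 10, tzinfo=timezone.utc)
print(idle_error("2024-03-18T18:59:06.729398+00:00", now))
```

Including the `last_seen` timestamp and a next step in the message gives the user something to correlate with the service logs.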