
500 Error on Task Status Endpoint Due to Missing Job Metadata in Redis

Open mjohenneken opened this issue 1 year ago • 2 comments

Actions before raising this issue

  • [x] I searched the existing issues and did not find anything similar.
  • [x] I read/searched the docs

Steps to Reproduce

  1. Install CVAT v2.32.0 using Docker Swarm
  2. Ensure Redis and all worker services are running, but note that the worker_chunks container might not be properly configured
  3. Create a task with multiple jobs
  4. Request the task status endpoint, /api/tasks/{id}/status, either directly or through an integration such as FiftyOne
  5. Observe a 500 Internal Server Error
  6. Check the server logs for the error: AttributeError: 'NoneType' object has no attribute 'meta'
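
For anyone who wants to hit the endpoint without FiftyOne, here is a minimal sketch using only the Python standard library. The helper names (`task_status_url`, `fetch_task_status`), the base URL, and the headers are placeholders of mine, not part of CVAT; authentication depends on the deployment and is left to the caller.

```python
import urllib.error
import urllib.request


def task_status_url(base_url: str, task_id: int) -> str:
    """Build the status URL from step 4. base_url is deployment-specific."""
    return f"{base_url.rstrip('/')}/api/tasks/{task_id}/status"


def fetch_task_status(base_url, task_id, headers=None, opener=urllib.request.urlopen):
    """Return (http_status, body_text) for the task status endpoint.

    headers: whatever auth headers your deployment needs (assumption: none shown here).
    opener:  injectable for testing; defaults to urllib's urlopen.
    """
    request = urllib.request.Request(
        task_status_url(base_url, task_id), headers=headers or {}
    )
    try:
        with opener(request) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; return the code so a 500 is visible to the caller.
        return error.code, error.read().decode("utf-8", "replace")


# Example call (placeholder host and task id):
# code, body = fetch_task_status("http://localhost:8080", 67, headers={...})
# On an affected server, code is 500.
```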

Expected Behavior

The task status endpoint should return a valid response with information about the task's progress and state, even when job metadata might be missing in Redis.

Possible Solution

The issue appears to be in the _get_rq_response method, which does not handle the case where the job can no longer be found in Redis. According to the stack trace, ImportRQMeta.for_job(job) is called with job set to None and fails when it accesses job.meta. A fix would involve adding error handling similar to what was done in commit 2f110e5 for a similar issue in the /api/requests endpoint.

This would gracefully handle cases where job metadata is missing without causing a 500 error.
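
A rough sketch of the kind of guard I have in mind is below. This is not CVAT's actual code: `FakeRQJob` and `get_rq_response` are simplified stand-ins I wrote for illustration, the state names are illustrative, and treating a job that is missing from Redis as "Finished" is my assumption about a sensible fallback.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: a job that is no longer in Redis is reported as finished.
MISSING_JOB_STATE = "Finished"


@dataclass
class FakeRQJob:
    """Hypothetical stand-in for an RQ job; not the real rq.job.Job."""
    meta: Optional[dict] = field(default_factory=dict)
    is_finished: bool = False
    is_failed: bool = False
    is_queued: bool = False


def get_rq_response(job: Optional[FakeRQJob]) -> dict:
    """Build a status response without ever dereferencing a missing job."""
    # Guard first: the lookup may return None, which is what triggers
    # the AttributeError in the stack trace when job.meta is accessed.
    if job is None:
        return {"state": MISSING_JOB_STATE}
    if job.is_finished:
        return {"state": "Finished"}
    if job.is_failed:
        return {"state": "Failed"}
    if job.is_queued:
        return {"state": "Queued"}

    # The job exists but its metadata may still be empty or None.
    meta = job.meta or {}
    response = {"state": "Started", "progress": meta.get("task_progress", 0.0)}
    if "status" in meta:
        response["message"] = meta["status"]
    return response
```

The key point is only the ordering: the None check happens before anything touches job.meta, so the endpoint returns a normal response instead of raising.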

Context

The issue was discovered while attempting to integrate CVAT with FiftyOne for annotation visualization. When FiftyOne tries to retrieve the task status during annotation import, it receives HTTP 500 errors from the status endpoint. The problem affects both old tasks (migrated from v2.22.0) and new tasks created in v2.32.0.

Adding the worker_chunks service helped with some aspects, but the core issue remains: the code does not handle missing Redis metadata. In a Docker Swarm deployment the issue is more pronounced because of how the services are configured. The problem might be related to recent changes in rq.py, where a comment mentions making job metadata nullable.

Environment

Git hash commit: Using tagged version v2.32.0
Docker version: Docker 20.10.21
Using Docker Swarm
Operating System: Linux, Ubuntu 20.04

<details>
<summary>Stack Trace of the HTTP 500</summary>
ERROR django.request: Internal Server Error: /api/tasks/67/status
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/asgiref/sync.py", line 518, in thread_handler
    raise exc_info[1]
  File "/opt/venv/lib/python3.10/site-packages/django/core/handlers/exception.py", line 42, in inner
    response = await get_response(request)
  File "/opt/venv/lib/python3.10/site-packages/django/core/handlers/base.py", line 253, in _get_response_async
    response = await wrapped_callback(
  File "/opt/venv/lib/python3.10/site-packages/asgiref/sync.py", line 468, in __call__
    ret = await asyncio.shield(exec_coro)
  File "/opt/venv/lib/python3.10/site-packages/asgiref/current_thread_executor.py", line 40, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/venv/lib/python3.10/site-packages/asgiref/sync.py", line 522, in thread_handler
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/django/views/decorators/csrf.py", line 56, in wrapper_view
    return view_func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/viewsets.py", line 124, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 509, in dispatch
    response = self.handle_exception(exc)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 469, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
    raise exc
  File "/opt/venv/lib/python3.10/site-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/home/django/cvat/apps/engine/views.py", line 1593, in status
    response = self._get_rq_response(
  File "/home/django/cvat/apps/engine/views.py", line 1607, in _get_rq_response
    rq_job_meta = ImportRQMeta.for_job(job)
  File "/home/django/cvat/apps/engine/rq.py", line 148, in for_job
    return cls(job=job, meta=job.meta)
AttributeError: 'NoneType' object has no attribute 'meta'
</details>

mjohenneken avatar Apr 14 '25 21:04 mjohenneken

Possibly related: #9076. So there are plans to drop that endpoint? In that case the FiftyOne CVAT integration might need an update. It's not clear to me which versions are compatible.

mjohenneken avatar Apr 14 '25 21:04 mjohenneken

Further testing revealed that this happens only in 2.31 and 2.32. In 2.30 it works.

mjohenneken avatar Apr 14 '25 22:04 mjohenneken