
Attach data to a task: better MIME type detection

Open deltheil opened this issue 5 months ago • 2 comments

Actions before raising this issue

  • [X] I searched the existing issues and did not find anything similar.
  • [X] I read/searched the docs

Is your feature request related to a problem? Please describe.

Context

I am uploading image files via https://app.cvat.ai/api/docs/#tag/tasks/operation/tasks_create_data (using the client_files parameter).

In my case, the image files are stored on disk in a content-addressable manner, mimicking how git stores and names files. For example, a JPEG file would typically be stored as /var/misc/images/1f/ec4f5cee029f96c1e9eddd09821a51c0a9f80a, i.e. without any file extension.

Problem

The problem lies in the CVAT engine's MIME type detection, which relies solely on file extensions:

  • https://github.com/cvat-ai/cvat/blob/f93d58c1ca9401daeee5beba5d5f79ace975c02b/cvat/apps/engine/task.py#L215-L231
  • https://github.com/cvat-ai/cvat/blob/f93d58c1ca9401daeee5beba5d5f79ace975c02b/cvat/apps/engine/media_extractors.py#L859-L863

E.g. _is_image builds upon https://docs.python.org/3/library/mimetypes.html#mimetypes.guess_type, which only looks at the file name and never at the file contents:

import mimetypes

def _is_image(path):
    # guess_type returns a (type, encoding) tuple derived from the file name only
    mime = mimetypes.guess_type(path)
    # Exclude vector graphic images because Pillow cannot work with them
    return mime[0] is not None and mime[0].startswith('image') and \
        not mime[0].startswith('image/svg')
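
To make the failure mode concrete, here is a minimal sketch (not CVAT code) of the same extension-based check applied to two paths. The helper name is_image_by_extension and the photo.jpg path are made up for illustration; the second path is the one from the Context section.

```python
import mimetypes

def is_image_by_extension(path):
    # Same logic as the _is_image snippet quoted above: file name only.
    mime, _ = mimetypes.guess_type(path)
    return (
        mime is not None
        and mime.startswith("image")
        and not mime.startswith("image/svg")
    )

# Hypothetical path with a conventional extension.
with_ext = "/var/misc/images/photo.jpg"
# Content-addressed path from this issue: no extension at all.
without_ext = "/var/misc/images/1f/ec4f5cee029f96c1e9eddd09821a51c0a9f80a"

print(is_image_by_extension(with_ext))     # True
print(is_image_by_extension(without_ext))  # False, even if the file is a valid JPEG
```

Since the file is never opened, the second path is rejected regardless of what it contains.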

tl;dr

In my case, all the uploaded image files get ignored.

Describe the solution you'd like

I think it would be great if MIME type detection could be extended to support magic detection (based on file headers), e.g. using https://github.com/ahupp/python-magic or anything equivalent. In other words, detection should not be limited to file extensions (.jpg, etc.).

NB: I am talking about images here, but the same could of course be done for other media types.
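
A rough sketch of what I have in mind, using only the standard library so it stays self-contained. The names sniff_mime, guess_mime and is_image, as well as the tiny signature table, are illustrative assumptions and not existing CVAT code; a library such as python-magic would cover far more formats than this hand-written table.

```python
import mimetypes

# Illustrative subset of well-known file signatures (assumption: a real
# implementation would delegate to libmagic or similar instead).
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)

def sniff_mime(path):
    """Return a MIME type based on the first bytes of the file, or None."""
    with open(path, "rb") as f:
        header = f.read(16)
    for signature, mime in _SIGNATURES:
        if header.startswith(signature):
            return mime
    return None

def guess_mime(path):
    """Try the extension first, then fall back to the file header."""
    mime, _ = mimetypes.guess_type(path)
    return mime or sniff_mime(path)

def is_image(path):
    mime = guess_mime(path)
    return (
        mime is not None
        and mime.startswith("image")
        and not mime.startswith("image/svg")
    )
```

Keeping the extension lookup as the first step means existing behavior is unchanged for files that already have a recognized extension; the header check only kicks in when the name gives no answer.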

Describe alternatives you've considered

As a workaround, I am forced to rename the files (add an extension) at upload time.
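
For reference, the renaming amounts to something like the sketch below. The helper upload_name is hypothetical, and it assumes every file is a JPEG and that the parent directory holds the first two characters of the hash (git-style layout); the actual upload call is left out.

```python
import mimetypes
from pathlib import PurePosixPath

def upload_name(path, extension=".jpg"):
    """Build a file name with an extension for a content-addressed path.

    Assumption: the caller already knows the media type (JPEG here).
    The directory prefix is kept so that names stay unique across
    directories.
    """
    p = PurePosixPath(path)
    return p.parent.name + p.name + extension

stored = "/var/misc/images/1f/ec4f5cee029f96c1e9eddd09821a51c0a9f80a"
name = upload_name(stored)

# The generated name now passes extension-based detection.
print(name)
print(mimetypes.guess_type(name)[0])
```

This works, but it pushes knowledge of the media type onto every client, which is what header-based detection on the server side would avoid.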

Additional context

No response

deltheil, Aug 26 '24 09:08