recognize Classify HEIC/HEIF images

Describe the bug Recognize does not ahem recognize HEIC/HEIC files as images.

To Reproduce Steps to reproduce the behavior:

Run classifier.
Find HEIC/HEIF file in files.
See no tags applied.

Expected behavior HEIC/HEIF images should be tagged, just as if they were JPEG/PNG/BMP/etc.

Recognize (please complete the following information):

JS-only mode: Yes
Enabled modes: face recognition, object recognition, landmark recognition

Server (please complete the following information):

Nextcloud: 22.2.8
OS: Docker Alpine
RAM: 32 GB
Processor Architecture: x86_64

Additional context This line https://github.com/marcelklehr/recognize/blob/dc02e60ab7a24e0ddb7b484ac1e287cc181a1ccd/lib/Service/ImagesFinderService.php#L16 sets the list of formats. ~Is adding 'image/heic' sufficient?~ Or does some conversion need to be done before calling the classifier nets?

According to https://github.com/oliver-moran/jimp/issues/771, Jimp doesn't currently support HEIC/HEIF. A call to ImageMagick may be needed to do the conversion.

Jun 17 '22 02:06 kc9jud

https://github.com/oliver-moran/jimp/issues/771#issuecomment-828363187 suggests that heic-convert or heic-decode are alternatives to calling ImageMagick.

Jun 17 '22 02:06 kc9jud

I sure hope this is something we can see being added in a future update, most phones support it at this time and some even have it on by default like mine! I actually forgot about it and have a lot of unrecognized pictures now :(

Jul 01 '22 01:07 StarSmasher44

I would love to get heic support as well. I understand that I can convert my heic pictures to jpeg - but I prefer to keep them in their original format.

Also, thanks for a great app!

Aug 04 '22 10:08 b3nis

Even a temporary fix would be good for now I would say. Something like converting the files to jpeg in RAM, analyzing that and freeing them in the end.

Aug 09 '22 17:08 afonsofrancof

@marcelklehr I really need this since I exclusively use HEIC, so I'm willing to implement it and send a PR. Do any of these solutions sound good to you?

Use ImageMagick to convert the image to Jpeg first, then pass it to node.
Use heic-convert to do this in node.
Convert and downscale all images to a max resolution before processing, using ImageMagick (not sure if this is already done somewhere). Downscaling the image to a reasonable size might speed up inference.

Oct 09 '22 16:10 pulsejet

@pulsejet using ImageMagick sounds like a good idea. Point 3 could also solve https://github.com/nextcloud/recognize/issues/111 Maybe we can use already generated images by the image preview generator app?

Oct 10 '22 08:10 marcelklehr

I would be a bit wary of using pre generated images since we've no control over them (e.g. the max size can be configured by the user, which might not be enough for us). Plus it introduces a weird dependency and the conversion isn't too much effort compared to running the actual inference.

It will probably solve the OOM issue. Let me take another look and send a PR for this.

EDIT: looks like I'll use the preview generator after all, but not the pre-generated images

Oct 10 '22 15:10 pulsejet

My heic photos have worked since installing and configuring the imagick stuff, can't confirm if that is just previews or not..

As a hacky solution, it would probably indeed be easier to take heic > convert to jpeg > show (and perhaps cache for a short while) heic as jpg, leaving all else intact and untouched.

Oct 10 '22 15:10 nssatlantis

As a hacky solution, it would probably indeed be easier to take heic > convert to jpeg > show (and perhaps cache for a short while) heic as jpg, leaving all else intact and untouched.

That's not hacky. We need to ultimately convert every image to bitmap for inference anyway; this is just the pipeline.

Oct 10 '22 15:10 pulsejet

@pulsejet using ImageMagick sounds like a good idea. Point 3 could also solve #111 Maybe we can use already generated images by the image preview generator app?

Would it be possible to follow the preview generation in the choice of the conversion tool? That would mean no additional dependencies and the code could probably be reused. Previews use either imagick or, supported recently, imaginary (compare https://github.com/nextcloud/all-in-one/pull/1026)

imagick has always been plagued by security issues, which is why NC is providing an alternative now and it would be great if recognize did the same.

Jan 22 '23 22:01 theCalcaholic

I don't think it uses imagick anymore #479

Jan 22 '23 23:01 pulsejet

@pulsejet Oh, that's good to hear. Then it's even more important to not reintroduce it here, imo :)

Jan 23 '23 06:01 theCalcaholic

This issue is already solved...

Jan 23 '23 06:01 pulsejet

@pulsejet After looking at the PR you linked, I have to contradict: They are using the NC preview provider, which in turn can use imagick, depending on it's configuration. That's actually the interface that abstracts from a specific preview provider (e.g. imaginary or imagick)

Jan 23 '23 06:01 theCalcaholic

This issue is already solved...

Oh, I see...

How was it solved, though? With or without imagick and in the latter case, did it introduce imagick as a hard dependency for recognize? Because since it is basically impossible to manage a nextcloud server that uses imagick without RCE vulnerabilities, that would be problematic.

Jan 23 '23 06:01 theCalcaholic

AFAIK recognize just uses the preview generator for clarssification now, so it can classify any image that can have preview generated with an appropriate provider. If imaginary is installed, it'll use that. So as such, any further dependency on imagick doesn't belong to this repo (but server).

Jan 23 '23 06:01 pulsejet

@theCalcaholic Sorry, I never came back and closed this issue. (I had not upgraded to NC25 for a while so I hadn't seen the new features until I got recognize 3.) It should have been closed back when #365 was merged in October, and any dependence on ImageMagick introduced in #365 was removed in #479 in November.

Jan 23 '23 06:01 kc9jud

@pulsejet Thanks again for implementing this!

Jan 23 '23 06:01 kc9jud

@pulsejet Thanks again for implementing this!

@marcelklehr for most of it 😄

Jan 23 '23 06:01 pulsejet

@theCalcaholic Sorry, I never came back and closed this issue. (I had not upgraded to NC25 for a while so I hadn't seen the new features until I got recognize 3.) It should have been closed back when #365 was merged in October, and any dependence on ImageMagick introduced in #365 was removed in #479 in November.

@pulsejet and @kc9jud Thank you for clarifying that :) That sounds like a very good solution. :+1:

Jan 23 '23 06:01 theCalcaholic

One further clarification: If I understand correctly, it's still possible that if you configure NC to use ImageMagick for preview generation, Recognize will indirectly call ImageMagick. Of course, we don't have any control over that here -- that's all upstream of us.

Jan 23 '23 06:01 kc9jud