Use an AI API as an option to automatically describe images embedded in a message

Open Askthedust84 opened this issue 11 months ago • 2 comments

Pitch

Someone mentioned this API, which a Mastodon web client already uses (https://mstdn.fr/@ButterflyOfFire/113933318525233468):

https://github.com/cheeaun/img-alt-api

Motivation

Many people don't describe their images for lack of knowledge, time, or simply willpower. Automatic descriptions would be of great benefit to blind users on the Fediverse.

Feb 02 '25 09:02 Askthedust84

I'm very much against anything that automatically captions an image. Only the author knows what context is important, and then there are all the problems LLMs have.

Maybe we should look into extracting text shown in an image like the Mastodon web interface does. We already had a go at it once: https://github.com/tuskyapp/Tusky/pull/1592

Mar 10 '25 21:03 connyduck

OCR and similar text extraction — yes please.

"AI" — no thanks.

One helps the author fill in text that's already in the image; the other is just an (error-prone) automated guesstimate that will not be of any use to the blind.

Mar 24 '25 09:03 haverholm