IPED Enable transcription by default in some circumstances

This issue is more a suggestion than an issue.

Transcription is not enabled by default, but in some circumstances it could be enabled or disabled automatically. I see two cases.

1. The transcription could be enabled by default when processing audio files attached to chats.
2. The transcription could be ignored by default when the audio file is known in the hashes database.

These behaviors could be changed using a property like

# Values: all, unknown, chats, none
itemsToTranscript = chats

Or using separate properties

enableTranscriptionOnlyForChats = true
disableTranscriptionForKnownFiles = false

Just ideas....
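To make the proposal concrete, here is a minimal sketch of how the four proposed values of itemsToTranscript could be interpreted. The AudioItem class and its fields are hypothetical and only illustrate the decision; they are not IPED's API, and reading "unknown" as "not found in the hashes database" is an assumption.

```python
from dataclasses import dataclass


# Hypothetical item model; field names are illustrative, not IPED's API.
@dataclass
class AudioItem:
    name: str
    in_chat: bool = False      # attached to a parsed chat
    known_hash: bool = False   # hash found in the known-files database


def should_transcribe(item: AudioItem, items_to_transcript: str) -> bool:
    """Apply the proposed itemsToTranscript values: all, unknown, chats, none."""
    mode = items_to_transcript.strip().lower()
    if mode == "all":
        return True
    if mode == "unknown":
        # Assumption: "unknown" means the file is not in the hashes database.
        return not item.known_hash
    if mode == "chats":
        return item.in_chat
    if mode == "none":
        return False
    raise ValueError(f"unsupported itemsToTranscript value: {items_to_transcript!r}")
```

A single enumerated property keeps the modes mutually exclusive, whereas two separate boolean properties would allow combinations such as "chats only, and skip known files".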

Jun 23 '22 02:06 aberenguel

Good idea. Maybe the transcription could be executed on the audio files returned by a configurable query, so the configuration parameter would contain this query.

Jun 23 '22 03:06 patrickdalla

Hi @aberenguel.

The transcription could be enabled by default when processing audio files attached to chats

This was already proposed informally by other users, so thanks for opening this discussion. The challenge is that when an audio file is processed, we don't yet know whether it is attached to a chat. Only when the chat is processed, in a 2nd or 3rd processing queue, do we discover the attached audios, and by then they have already been processed and indexed. We could re-process them at that point (thanks to #1062, updating indexed content), maybe using a thread pool, as in the new WhatsApp attachment download feature, so it doesn't run single-threaded. This would need some interface to update items in the index (it doesn't exist right now, but I'm planning to add one and will open a separate issue), and the processing pipeline may become a bit more complicated. I'm also not sure how to call the transcription module from the parsing module right now (possibly from a separate parsing process, but that doesn't happen today for items processed in queues > 0, like chats), so this idea needs refinement. Maybe the desired #24 could help, by running different processing pipelines one after the other, but that would require a reasonable effort...
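As a rough illustration of the re-processing idea, the sketch below runs a second pass over audios discovered during chat parsing, using a thread pool and a simple update of an in-memory index. Everything here is hypothetical: transcribe is a placeholder, and the dictionary stands in for the index-update interface that does not exist yet.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(path: str) -> str:
    # Placeholder for a real speech-to-text call (assumption for illustration).
    return f"<transcription of {path}>"


def reprocess_chat_audios(attachments, index, workers=4):
    """Transcribe chat attachments that have no transcription yet and
    store the result in the index. Returns the number of items processed."""
    # Skip items that were already transcribed in the first pass.
    pending = [p for p in attachments if "transcription" not in index.get(p, {})]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        for path, text in zip(pending, pool.map(transcribe, pending)):
            index.setdefault(path, {})["transcription"] = text
    return len(pending)
```

The index writes happen in the calling thread, so only the transcription itself runs in parallel; a real implementation would still have to decide how chats that reference these audios get refreshed.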

2. The transcription could be ignored by default when the audio file is known in hashes database.

This is easy to implement and is already done for photoDNA.

Maybe the transcription can be executed on audio files that returns as a result of a configurable query.

This could be part of the feature configuration, but the challenges I pointed out above still stand. I'm open to ideas...

Jun 23 '22 03:06 lfcnassif

A simple workaround for recorded audios from some chat apps, like WhatsApp, which I have already suggested to 2 users in the past, is to define new audio mimetypes as children of the original ones (opus, etc.) based on filename patterns, and then change AudioTranscriptConfig.txt to run only on those new mimetypes. That should work for some cases.
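The workaround can be pictured as follows: a detected mimetype is refined into a child type when the filename matches a pattern, and transcription is then configured only for the child types. This is a sketch under assumptions: the PTT filename pattern, the parent type audio/opus and the child name audio/x-chat-voice-opus are all illustrative, and in IPED the rule would be declared in the mimetype configuration rather than in code like this.

```python
import re

# Hypothetical rules: (filename pattern, parent mimetype, child mimetype).
CHILD_MIME_RULES = [
    (re.compile(r"^PTT-\d{8}-WA\d+\.opus$", re.IGNORECASE),
     "audio/opus", "audio/x-chat-voice-opus"),
]


def refine_mime(filename: str, detected_mime: str) -> str:
    """Return the child mimetype when both the parent type and the
    filename pattern match; otherwise keep the detected mimetype."""
    for pattern, parent, child in CHILD_MIME_RULES:
        if detected_mime == parent and pattern.match(filename):
            return child
    return detected_mime
```

With such a rule, a file named like PTT-20220623-WA0001.opus would get the child type, while an ordinary music file with the same container would keep the parent type and be skipped.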

Jun 23 '22 12:06 lfcnassif

Would the above be enough? Which recorded audio name patterns and extensions are most common out there?

Jun 23 '22 12:06 lfcnassif

Would the above be enough? Which recorded audio name patterns and extensions are most common out there?

I think being able to choose file name patterns and mimetypes for transcription would be OK. Combining that with some disableTranscriptionForKnownFiles property would be even better.

As you mentioned, processing files attached to chats would be complicated and a little risky to implement for the upcoming 4.0.0 release.

Jun 30 '22 02:06 aberenguel

So, which audio file extensions/mimetypes are most used by chat apps? opus, aac, flac?

Sep 22 '22 02:09 lfcnassif

IPED could have a flag/bookmark so you could generate your bookmarks.iped and set flags on the files you want to transcribe, then run it to generate your report. Or just flag the files you want to transcribe and reprocess only those files (and update chats). I don't know if IPED could do it incrementally.

Oct 13 '22 22:10 rafael844

IPED could have a flag/bookmark so you could generate your bookmarks.iped and set flags on the files you want to transcribe, then run it to generate your report.

You can enable transcription before generating the report, so it will run only on the bookmarked audios sent to the report. Chats won't be updated, since this use case was not the original goal of the feature. There are some challenges I already described in #696. If you could help, contributions are very welcome.

Or just flag the files you want to transcribe and reprocess only those files (and update chats). I don't know if IPED could do it incrementally.

This depends on the non-trivial #24.

Oct 13 '22 23:10 lfcnassif

Which recorded audio name patterns and extensions are most common out there?

Any suggestions about this?

Jan 24 '23 19:01 lfcnassif

In cases where there are a lot of MP3 music files, I would like to avoid the audio/mpeg mimetype. So I tried a set of mimetypes meant to cover the voice files sent in chat apps:

mimesToProcess = audio/3gpp; audio/aac; audio/aiff; audio/amr; audio/mp4; audio/ogg; audio/qcelp; audio/wav; audio/webm; audio/x-caf; audio/x-ms-wma; audio/x-opus+ogg
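For illustration, this is roughly how such a semicolon-separated value could be parsed and used as a filter. The function names are hypothetical and this is not how IPED reads its configuration; it only shows the intended effect of leaving audio/mpeg out of the list.

```python
def parse_mimes(value: str) -> set:
    """Split a semicolon-separated mimetype list into a normalized set."""
    return {m.strip().lower() for m in value.split(";") if m.strip()}


MIMES_TO_PROCESS = parse_mimes(
    "audio/3gpp; audio/aac; audio/aiff; audio/amr; audio/mp4; audio/ogg; "
    "audio/qcelp; audio/wav; audio/webm; audio/x-caf; audio/x-ms-wma; "
    "audio/x-opus+ogg"
)


def is_selected(mime: str) -> bool:
    """True when the item's mimetype is in the configured list."""
    return mime.strip().lower() in MIMES_TO_PROCESS
```

Because the match is exact, an item whose detected mimetype is an alias or a variant of a listed type would not be selected.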

Jan 30 '23 05:01 aberenguel

In cases where there are a lot of MP3 music files, I would like to avoid the audio/mpeg mimetype. So I tried a set of mimetypes meant to cover the voice files sent in chat apps:
mimesToProcess = audio/3gpp; audio/aac; audio/aiff; audio/amr; audio/mp4; audio/ogg; audio/qcelp; audio/wav; audio/webm; audio/x-caf; audio/x-ms-wma; audio/x-opus+ogg

Great! I'll use your list then, thank you!

Jan 31 '23 02:01 lfcnassif

Reopening: some important audio mimetypes are being missed. Either a wrong mimetype was configured or it is an alias; we should use the normalized version.
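A minimal sketch of the normalization step, assuming an alias table is available. The entries below are examples chosen for illustration; the actual canonical names depend on the mimetype registry used by the detector and should be checked there.

```python
# Illustrative alias table (assumption): alias -> canonical mimetype.
EXAMPLE_ALIASES = {
    "audio/wav": "audio/vnd.wave",
    "audio/x-wav": "audio/vnd.wave",
}


def normalize_mime(mime: str, aliases: dict) -> str:
    """Map an alias to its canonical name; unknown names pass through."""
    key = mime.strip().lower()
    return aliases.get(key, key)


def normalize_config(mimes, aliases: dict) -> set:
    """Normalize every configured mimetype so it matches detector output."""
    return {normalize_mime(m, aliases) for m in mimes}
```

Normalizing both the configured list and the detected type with the same table avoids silent misses when a configuration entry uses an alias.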

Feb 14 '23 18:02 lfcnassif