architecture Add enqueue option for TTS

Context

The need for this was raised in https://github.com/home-assistant/core/pull/36951

The goal of this change is to use the queuing capability already in place in the media_player entity type to chain TTS messages with each other or with other sounds.

This would give voice applications more options and ease API usage (which is usually limited) by chaining shorter, easily cacheable "blocks" of text together. The same result can be achieved today without this change, but with many drawbacks, such as having to detect when a player has finished the previous sound clip (with an arbitrary timer or by watching the state) or having a long wait between sound clips while each one is generated.

See the feature request: https://community.home-assistant.io/t/append-media-function-for-media-player-component/168180

Proposal

I'm proposing to add an "enqueue" option to the TTS service. If true, the service will call media_player.play_media with the already implemented but almost undocumented "enqueue" option. This option is currently available on sonos, heos, squeezebox, and bluesound. The PR above also adds it to MPD. Basically, any media player with playlist capability could support it. Players that are not capable will ignore the option and work as before.
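To illustrate the proposal, here is a minimal sketch of how the TTS side could forward the option. It is not the actual Home Assistant code: the helper name `build_play_media_data` and the constant `ATTR_ENQUEUE` are assumptions made up for this example, and the content type "music" is assumed to be what TTS passes along. Only the field names `entity_id`, `media_content_id`, `media_content_type`, and `enqueue` come from the media_player.play_media service.

```python
from typing import Any

# Hypothetical constant name, used only in this sketch.
ATTR_ENQUEUE = "enqueue"


def build_play_media_data(
    entity_id: str, media_url: str, tts_options: dict[str, Any]
) -> dict[str, Any]:
    """Build the service data a TTS call would hand to media_player.play_media.

    The "enqueue" key is only added when the caller asked for it, so players
    that do not know the option receive exactly the same data as before.
    """
    data: dict[str, Any] = {
        "entity_id": entity_id,
        "media_content_id": media_url,
        # Assumption: TTS audio is sent with the "music" content type.
        "media_content_type": "music",
    }
    if tts_options.get(ATTR_ENQUEUE):
        data[ATTR_ENQUEUE] = True
    return data


# Example: two short, cacheable messages chained on the same player.
# The entity id and URLs are placeholders.
first = build_play_media_data(
    "media_player.kitchen", "http://example.invalid/a.mp3", {}
)
second = build_play_media_data(
    "media_player.kitchen", "http://example.invalid/b.mp3", {"enqueue": True}
)
```

With this shape, the first message replaces whatever is playing and the second is appended to the player's queue, so the caller does not need a timer or a state watcher between the two calls.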

Consequences

The pros were explained in Context, but because I'm bad at structuring a text, this will be a repeat.

This would give voice applications more options and ease API usage (which is usually limited) by chaining shorter, easily cacheable "blocks" of text together. The same result can be achieved today without this change, but with many drawbacks, such as having to detect when a player has finished the previous sound clip (with an arbitrary timer or by watching the state) or having a long wait between sound clips while each one is generated.

The con is the added confusion of a feature that is not universally available. TTS is already pretty confusing to understand, but since not every media_player supports the play_media function required for TTS, I feel this issue already exists anyway.

Thanks for reading.

Jun 20 '20 20:06 Vaarlion

+1 This would be a valuable enhancement and I hope it is implemented.

Jun 29 '20 19:06 tdejneka

Yes, I would definitely find this helpful as well in order to avoid longer wait times between requesting a TTS message be played and it actually being played.

As an alternative to adding the messages directly onto the media player queue, it may make sense to add a function to TTS that fetches the message (and thus loads it into the cache) but doesn't play it. That would still speed things up without necessarily requiring a change to the full model, as I understand it.
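A rough sketch of that pre-fetch idea, under stated assumptions: the `SpeechCache` class, its `prefetch` and `get` methods, and the `synthesize` callable are all hypothetical names for this example and do not describe how the TTS integration is actually built.

```python
import hashlib
from typing import Callable


class SpeechCache:
    """Hypothetical in-memory cache keyed by a hash of the message text."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        # synthesize stands in for the (slow) call to the TTS engine.
        self._synthesize = synthesize
        self._audio: dict[str, bytes] = {}

    @staticmethod
    def _key(message: str) -> str:
        return hashlib.sha1(message.encode("utf-8")).hexdigest()

    def prefetch(self, message: str) -> str:
        """Generate and store the audio without playing it; return its key."""
        key = self._key(message)
        if key not in self._audio:
            self._audio[key] = self._synthesize(message)
        return key

    def get(self, message: str) -> bytes:
        """Return the audio, generating it only if it was not prefetched."""
        return self._audio[self.prefetch(message)]


# Example with a fake engine that records how often it is called.
calls: list[str] = []


def fake_engine(message: str) -> bytes:
    calls.append(message)
    return message.encode("utf-8")


cache = SpeechCache(fake_engine)
cache.prefetch("Dinner is ready")      # slow step happens here, ahead of time
audio = cache.get("Dinner is ready")   # served from the cache, no second call
```

The point of the design is that the slow generation step is moved ahead of playback, so the wait between requesting a message and hearing it shrinks even on players without a queue.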

@frenck - wdyt?

Aug 08 '20 06:08 mountainsandcode

@mountainsandcode I think you should not ping random people. Thanks 👍

Aug 08 '20 10:08 frenck

Hello, it's been 2 years since the last entry in the above discussion. Does anyone know if there are any plans to complete the discussion and implement the feature? Apologies in case I missed something.

Nov 27 '22 23:11 rafal-re

I'm not aware of any. I'm stuck on an old version for now with my patch running, and I'm trying to fix the light template so I can upgrade.

But I believe I've seen some code in master doing the TTS enqueue? Maybe it's already done, skipping this issue completely.

Nov 28 '22 10:11 Vaarlion

This architecture issue is old, stale, and possibly obsolete. Things changed a lot over the years. Additionally, we have been moving to discussions for these architectural discussions.

For that reason, I'm going to close this issue.

../Frenck

May 11 '23 14:05 frenck
