What is the purpose of the SpeechFeatureEmbedding class?
So we have the Transformer input layer, which does the input embedding and the positional encoding, but then a class like this is applied:
```python
import tensorflow as tf
from tensorflow.keras import layers


class SpeechFeatureEmbedding(layers.Layer):
    def __init__(self, num_hid=64, maxlen=100):
        super().__init__()
        # Three strided convolutions over the time axis of the spectrogram.
        self.conv1 = tf.keras.layers.Conv1D(
            num_hid, 11, strides=2, padding="same", activation="relu"
        )
        self.conv2 = tf.keras.layers.Conv1D(
            num_hid, 11, strides=2, padding="same", activation="relu"
        )
        self.conv3 = tf.keras.layers.Conv1D(
            num_hid, 11, strides=2, padding="same", activation="relu"
        )
        # Defined here but never used in call() below.
        self.pos_emb = layers.Embedding(input_dim=maxlen, output_dim=num_hid)

    def call(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        return self.conv3(x)
```
So my question is: why do we need to add Conv1D layers here? We already have positional encoding, don't we? What is the purpose of this...
The class is in this code: https://github.com/keras-team/keras-io/blob/master/examples/audio/transformer_asr.py
To my understanding, SpeechFeatureEmbedding only supplies the encoder input, and TokenEmbedding supplies the decoder input. However, it seems SpeechFeatureEmbedding doesn't do any positional encoding. I'm not sure whether this is a bug, so I've asked for clarification.
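For what it's worth, if positional information were intended here, a minimal sketch of how the otherwise unused self.pos_emb could be wired into call() might look like the following. This is only a guess at the intent, mirroring the pattern TokenEmbedding uses on the decoder side, not what the example actually does:

```python
# Hypothetical variant of call(); NOT what transformer_asr.py actually does.
# Assumes the convolved output has at most `maxlen` time steps.
def call(self, x):
    x = self.conv1(x)
    x = self.conv2(x)
    x = self.conv3(x)  # shape: (batch, time / 8, num_hid)
    positions = tf.range(start=0, limit=tf.shape(x)[1], delta=1)
    return x + self.pos_emb(positions)  # add learned positional embeddings
```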
I have the same question as @BernardoOlisan: the SpeechFeatureEmbedding class defines self.pos_emb = layers.Embedding(input_dim=maxlen, output_dim=num_hid) but never uses it. I need clarification as well.
As mentioned in one of the comments, SpeechFeatureEmbedding is for the encoder input and TokenEmbedding is for the decoder input.
Conv layers are generally used for audio; here they extract local features and downsample the time axis of the spectrogram before it reaches the attention layers.
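To make the conv layers' role concrete, here is a minimal sketch using the class quoted above, with an assumed dummy spectrogram shape: each Conv1D has strides=2, so the three of them cut the time axis by a factor of 8 and project the features to num_hid, which shortens the sequence the self-attention layers have to process.

```python
import tensorflow as tf

# Assumed dummy input: batch of 4 spectrograms, 1000 time frames, 129 frequency bins.
dummy_audio_features = tf.random.normal((4, 1000, 129))

embedding = SpeechFeatureEmbedding(num_hid=64, maxlen=100)
out = embedding(dummy_audio_features)

print(dummy_audio_features.shape)  # (4, 1000, 129)
print(out.shape)                   # (4, 125, 64): time axis halved three times, features projected to num_hid
```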
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.