Could you give some examples of the shapes below?

Open moses1994 opened this issue 7 years ago • 1 comments

    # 3. input_length (required for CTC loss)

    # this is the time dimension of CTC (batch x time x mfcc)
    #input_length = np.array([get_xsize(mfcc) for mfcc in X_data])
    input_length = np.array(x_val)
    # print("3. input_length shape:", input_length.shape)   
    # print("3. input_length =", input_length)
    assert(input_length.shape == (self.batch_size,))

    # 4. label_length (required for CTC loss)
    # this is the length of the number of label of a sequence
    #label_length = np.array([len(l) for l in labels])
    label_length = np.array(y_val)
    # print("4. label_length shape:", label_length.shape)
    # print("4. label_length =", label_length)
    assert(label_length.shape == (self.batch_size,))

Hi, I want to build a CTC demo, but I don't understand "label_length.shape" and "input_length.shape". How are they calculated, and what do they mean? Thank you.

moses1994, May 07 '18 14:05

@moses1994 shape is an attribute of a numpy.array: a tuple giving the size of the array along each dimension. A shape of (2, 3) means a 2-dimensional 2x3 matrix. In this code, label_length is a 1-dimensional array in which each element is the length of one transcript in the batch, so its shape is (batch,). input_length is likewise 1-dimensional, holding one time-dimension length per sample. You don't need to calculate the shape yourself; it follows from how the array is built.

revive, Sep 03 '18 13:09
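
To make the shapes concrete, here is a minimal sketch with a made-up batch of three utterances. The feature count, frame counts, and integer label ids are all hypothetical, and it uses mfcc.shape[0] in place of the repository's get_xsize helper, whose implementation is not shown in this thread. It also assumes the model does not shorten the time axis, so the input frame count can be passed as input_length.

```python
import numpy as np

batch_size = 3
n_mfcc = 26  # assumed number of MFCC features per frame

# Hypothetical batch: each utterance has a different number of time steps.
frames_per_utterance = [50, 42, 37]
X_data = [np.zeros((t, n_mfcc)) for t in frames_per_utterance]

# Hypothetical transcripts, already encoded as integer character ids.
labels = [[8, 5, 12, 12, 15], [3, 1, 20], [4, 15]]

# One entry per sample: number of time steps (frames) in that utterance.
input_length = np.array([mfcc.shape[0] for mfcc in X_data])

# One entry per sample: number of label ids in that transcript.
label_length = np.array([len(l) for l in labels])

# Both are 1-dimensional arrays with batch_size elements.
assert input_length.shape == (batch_size,)
assert label_length.shape == (batch_size,)

print(input_length, input_length.shape)  # [50 42 37] (3,)
print(label_length, label_length.shape)  # [5 3 2] (3,)
```

So the shape is simply (batch_size,) in both cases; the values inside are what CTC needs, namely how many frames and how many labels each sample really has before any padding.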