audio-transformers-course

Suggestion to add shape info in preprocessing

Open mishig25 opened this issue 1 year ago • 2 comments

In the section about preprocessing, it would be useful to add type/shape information for the data produced by the preprocessing step.

Specifically, at https://github.com/huggingface/audio-transformers-course/blob/ac81306fb8822fa8c4e2a43748be8ba31d8bb043/chapters/en/chapter1/preprocessing.mdx#L186 it would be very useful to add a comment stating the type/shape of input_features. Is it a 3D array of floats like [time, freq, ampl]?
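For reference, here is a rough sketch of how the expected shape could be reasoned about. It assumes a log-mel feature extractor with Whisper-style settings (16 kHz audio, a 30-second window, a hop of 160 samples, 80 mel bins); these numbers and the helper name `expected_feature_shape` are assumptions for illustration, not taken from the course text. Under those assumptions each example would be a 2D array of floats laid out as [freq, time], with the amplitude being the value stored in each cell rather than a third axis.

```python
def expected_feature_shape(sampling_rate=16_000, window_seconds=30,
                           hop_length=160, n_mels=80):
    """Return the (n_mels, n_frames) shape of one log-mel spectrogram.

    All defaults are assumed Whisper-style values, used only for illustration.
    """
    # Total number of audio samples after padding/truncating to the window.
    n_samples = sampling_rate * window_seconds
    # One spectrogram frame is produced per hop of `hop_length` samples.
    n_frames = n_samples // hop_length
    return (n_mels, n_frames)


print(expected_feature_shape())  # (80, 3000)
```

A comment such as `# input_features: float array, shape (80, 3000) per example` next to the line in question would probably be enough, if the actual extractor matches these assumptions.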

mishig25 avatar Jul 17 '23 10:07 mishig25

+1 to this. Either a comment or a cell whose output shows the entries and the type/shape of input_features would help. I ran into this myself and added such a cell to my notebooks for better visualization. I also think that introducing this in the initial lessons would be a great help and would stay relevant for the later lessons. I found a very useful link in the Transformers docs on data preprocessing here, which explains how padding and truncation change the shape of input_features so that every audio sample has the same size. See the image below:

image
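To make the padding/truncation point concrete, here is a minimal plain-Python sketch of the idea. The helper `pad_or_truncate` is hypothetical and only mimics the behaviour conceptually; it is not the library's implementation.

```python
def pad_or_truncate(samples, target_length, pad_value=0.0):
    """Return a copy of `samples` with exactly `target_length` entries."""
    if len(samples) >= target_length:
        # Truncation: drop everything past the target length.
        return list(samples[:target_length])
    # Padding: append `pad_value` until the target length is reached.
    return list(samples) + [pad_value] * (target_length - len(samples))


short_clip = [0.1, 0.2, 0.3]
long_clip = [0.5] * 10

# Both clips end up with the same length, so they can be batched together.
batch = [pad_or_truncate(clip, 6) for clip in (short_clip, long_clip)]
print([len(clip) for clip in batch])  # [6, 6]
```

Once every clip has the same number of samples, the features computed from them also share one shape, which is why a fixed shape can be documented in a comment.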

snehilsanyal avatar Sep 22 '23 06:09 snehilsanyal

cc: @sanchit-gandhi

mishig25 avatar Sep 22 '23 08:09 mishig25