audio-transformers-course
Suggestion to add shape info in preprocessing
In the preprocessing section, it would be useful to add type and shape information for the data produced by preprocessing.
Specifically, at https://github.com/huggingface/audio-transformers-course/blob/ac81306fb8822fa8c4e2a43748be8ba31d8bb043/chapters/en/chapter1/preprocessing.mdx#L186 it would be very useful to add a comment stating the type and shape of `input_features`. Is it a 3D array of floats like [time, freq, ampl]?
+1 to this.
Either a comment or a cell whose output shows the entries and the type/shape of `input_features` would work. I ran into this too and added a cell for better visualization in my notebooks.
I also think that introducing this in the initial lessons would be a great help and relevant for the next lessons.
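For reference, a cell along these lines is roughly what I mean. It is only a sketch: `nested_shape` and `leaf_type` are hypothetical helper names, and `fake_input_features` is a made-up stand-in for the real `input_features` entry, which (as far as I can tell) comes back as nested Python lists after mapping unless the dataset format is changed.

```python
def nested_shape(x):
    """Return the shape of a rectangular nested list/tuple as a tuple of ints."""
    shape = []
    while isinstance(x, (list, tuple)):
        shape.append(len(x))
        if len(x) == 0:
            break
        x = x[0]  # descend along the first element of each level
    return tuple(shape)


def leaf_type(x):
    """Return the type name of the first innermost element."""
    while isinstance(x, (list, tuple)) and len(x) > 0:
        x = x[0]
    return type(x).__name__


# Made-up stand-in for example["input_features"]:
# 1 item, 4 frequency bins, 6 time frames (the real sizes will differ).
fake_input_features = [[[0.0] * 6 for _ in range(4)]]

print(nested_shape(fake_input_features), leaf_type(fake_input_features))
# (1, 4, 6) float
```

In a notebook, the same two calls could be run on one preprocessed example in place of `fake_input_features`.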
I found a very useful page in the Transformers docs on data preprocessing, which mentions how padding and truncation can change the shape of `input_features` so that every audio sample has the same size. See the image below:
cc: @sanchit-gandhi
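To make the padding/truncation point concrete, here is a rough pure-Python sketch of why every example ends up with the same shape. The constants are assumptions on my side, taken from what I believe are the Whisper feature extractor defaults (16 kHz audio, 30-second window, hop length 160, 80 mel bins); the thread itself does not say which feature extractor is in use. `pad_or_truncate` and `feature_shape` are hypothetical helpers, not library functions, and the frame count is an approximation of what a spectrogram computation would produce.

```python
# Assumed values (Whisper-style defaults); adjust for other feature extractors.
SAMPLING_RATE = 16_000   # samples per second
CHUNK_SECONDS = 30       # fixed window length in seconds
HOP_LENGTH = 160         # samples between consecutive spectrogram frames
N_MELS = 80              # number of mel frequency bins


def pad_or_truncate(samples, target_len, pad_value=0.0):
    """Cut the waveform to target_len or right-pad it with pad_value."""
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [pad_value] * (target_len - len(samples))


def feature_shape(n_samples):
    """Approximate per-example spectrogram shape: (mel bins, time frames)."""
    return (N_MELS, n_samples // HOP_LENGTH)


target_len = SAMPLING_RATE * CHUNK_SECONDS
short_clip = [0.1] * (5 * SAMPLING_RATE)    # 5 seconds of dummy audio
long_clip = [0.1] * (40 * SAMPLING_RATE)    # 40 seconds of dummy audio

for clip in (short_clip, long_clip):
    fixed = pad_or_truncate(clip, target_len)
    print(len(clip), "->", len(fixed), feature_shape(len(fixed)))
# 80000 -> 480000 (80, 3000)
# 640000 -> 480000 (80, 3000)
```

Under these assumptions each example would be a 2D array of floats with shape (frequency bins, time frames), and batching adds a leading dimension, so the 3D shape would be closer to [batch, freq, time] than [time, freq, ampl], with the amplitudes being the values stored in the array.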