ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Results 55 ast issues

![Screenshot](https://user-images.githubusercontent.com/88529547/182419468-7326edfd-9d7c-404b-b314-c20e63ef67df.PNG)

bug

Hi, sorry to bother you. Why does the paper say that the two special [CLS] tokens in DeiT are averaged into a single [CLS] token, but in the code I...

question

Thanks for making such a wonderful repo. When I run the model, the validation loss is very high (around 0.6940), but the mAP keeps increasing normally. Could you please explain...
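One possible reading of the numbers in this question, not confirmed anywhere in the thread: if the loss is binary cross-entropy, a value near 0.693 is what predictions close to 0.5 produce (ln 2 ≈ 0.6931), while average precision depends only on how the scores are ranked. The sketch below uses made-up labels and scores, and the helper names `bce` and `average_precision` are illustrative rather than functions from the repo.

```python
import math


def bce(labels, probs):
    """Mean binary cross-entropy over paired labels and probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)


def average_precision(labels, scores):
    """Average precision for one class: mean precision at each positive, ranked by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical data: every positive outranks every negative,
# but all scores sit very close to 0.5.
labels = [1, 0, 1, 0, 0, 1]
probs = [0.52, 0.49, 0.51, 0.48, 0.47, 0.53]

loss = bce(labels, probs)
ap = average_precision(labels, probs)
print(f"loss={loss:.4f}  ln2={math.log(2):.4f}  AP={ap:.2f}")
```

Under these assumptions the ranking is perfect while the loss stays close to ln 2, so a flat loss alongside a rising mAP is not contradictory in itself. Whether that explains the behaviour reported here depends on the loss function and on how the validation loss is computed in the actual run.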

bug

Hi Yuan, I have tested your AST model pretrained on AudioSet on my own dataset, and I noticed that it achieves similar performance to EfficientNet pretrained on AudioSet using PSLA...

enhancement

Avoid an incompatible torchvision dependency on torch 1.11.0. Fixes #68.

Following setup instructions at present will lead to an error on the model import. Specifically, `torchvision` needs to be pinned to a version. Otherwise, `pip` may install (e.g.) 0.12.0, which...

bug

Hello! First of all, congratulations on your amazing work. I'm doing my MSc Thesis on audio classification (respiratory disease diagnosis from lung sounds). My main objective is to improve the...

question

Hi. For audio of different lengths, the padding in the dataset is applied to the `fbank`. Why not pad the waveform first and then convert it to `fbank`?
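For readers unfamiliar with the operation being asked about, here is a minimal sketch of padding at the feature level. It assumes the `fbank` is a list of frames, each a list of mel-bin values. The helper `pad_or_truncate` and its `pad_value` parameter are hypothetical and are not taken from the repo's dataloader.

```python
def pad_or_truncate(fbank, target_length, pad_value=0.0):
    """Return a copy of `fbank` with exactly `target_length` frames.

    Short inputs get extra frames filled with `pad_value`;
    long inputs are cut at `target_length`.
    """
    if not fbank:
        raise ValueError("fbank must contain at least one frame")
    n_bins = len(fbank[0])
    # Copy (and, if needed, truncate) the existing frames.
    frames = [list(frame) for frame in fbank[:target_length]]
    # Append constant-valued frames until the target length is reached.
    missing = target_length - len(frames)
    frames.extend([pad_value] * n_bins for _ in range(missing))
    return frames


short_clip = [[1.0, 2.0], [3.0, 4.0]]   # 2 frames, 2 mel bins (toy values)
padded = pad_or_truncate(short_clip, 4)
trimmed = pad_or_truncate(short_clip, 1)
```

Padding here fills the new frames with a value chosen directly in the feature domain. Padding the waveform with zeros first would instead pass the silence through the filterbank and log, so the padded frames would take whatever value that computation yields for silence.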

question

Hi, Dr. Gong, I use AST on my own dataset. I have created the .json file and .csv file according to the guide. However, when I run run.sh, an error occurred...

Hi, Yuan Gong, does it support distributed training with multiple GPUs on different machines? Looking forward to your reply, thanks!