ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
Hello, I have learned from the example of extracting features from speech using the AST model. I mimicked this example to extract features from new speech using my own model,...
I am a graduate student from China, and our team recently had the privilege of studying your article on the 'Audio Spectrogram Transformer'. We were truly impressed by the content...
Hi @YuanGongND, I have trained AST on AudioSet for 36000 iterations, and the validation mAP is 0.029. Is this right? Looking forward to your reply.
Dear Minister Gong, I wanted to express my gratitude for your work on AST; it has truly been an inspiration to me. I can confidently say that AST has served...
I am trying to use the feature extractor on my audio files. My audio files are all 16000 Hz and 5 seconds long. The `waveform.shape[1]` is 80000. `input_values = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt").input_values`...
Hi, I'm attempting to reproduce the performance metrics of the models using HuggingFace's Pipeline utility, but I'm getting different results. Below is the Python code I used for testing: `import`...
Hello Yuan, I'm delighted to read your paper and reproduce your work, and I have encountered some problems. When employing the audioset_pretrain, why is the stride the same as the patch_size (overlap == 0)? In...
When running, the following error occurs and the number of training sessions is limited to 50. Is it necessary to save csv? I am wondering if there is a way...
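On the question above about the stride being equal to the patch_size (overlap == 0), the effect of the stride on sequence length can be checked with a little arithmetic. The following is a minimal sketch, not the repository's code: the helper `num_patches` is a hypothetical name, and the 16x16 patch on a 128 x 1024 spectrogram (mel bins x frames) is an assumed configuration chosen for illustration.

```python
def num_patches(freq_bins, time_frames, patch=16, fstride=16, tstride=16):
    """Count the patches produced by sliding a square patch over a spectrogram.

    Uses the standard no-padding sliding-window formula on each axis:
    (size - patch) // stride + 1.
    """
    f_dim = (freq_bins - patch) // fstride + 1
    t_dim = (time_frames - patch) // tstride + 1
    return f_dim * t_dim


# Assumed input size: 128 mel bins x 1024 frames (illustrative only).
no_overlap = num_patches(128, 1024, fstride=16, tstride=16)  # stride == patch size
overlap_6 = num_patches(128, 1024, fstride=10, tstride=10)   # overlap of 6

print(no_overlap, overlap_6)
```

Under these assumptions the non-overlapping setting gives 512 patches and the stride-10 setting gives 1212, so removing the overlap shortens the Transformer input sequence by more than half.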