Yuan Gong
ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
vocalsound
Dataset and baseline code for the VocalSound dataset (ICASSP 2022).
psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
gopt
Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".
python-compute-eer
Simple Python script to compute equal error rate (EER) for machine learning model evaluation.
realtime-adversarial-attack
Code for the IJCAI 2019 paper "Real-time Adversarial Attack".
ReMASC
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems.
whisper-at
Code and pretrained models for the Interspeech 2023 paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers".
ltu
Code, dataset, and pretrained models for the audio and speech large language model "Listen, Think, and Understand".