audio-understanding topic

Awesome-Omni-Large-Models-and-Datasets

Stars: 20, Forks: 0, Watchers: 20

🔥 Omni large models and datasets for understanding and generating multi-modalities.

Fun-ASR

Stars: 449, Forks: 27, Watchers: 449

Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

VideoAgent

Stars: 336, Forks: 47, Watchers: 336

"VideoAgent: All-in-One Agentic Framework for Video Understanding, Editing, and Remaking"

DIFFA

Stars: 40, Forks: 2, Watchers: 40

[AAAI 2026] DIFFA: Large Language Diffusion Models Can Listen and Understand

FlexSED

Stars: 37, Forks: 2, Watchers: 37

Open-vocabulary sound event detection.

Voxtral-AI-Demo-Local-Interface

Stars: 27, Forks: 2, Watchers: 27

Voxtral is a state-of-the-art model developed to handle both speech transcription and audio understanding with remarkable accuracy and efficiency. This demo interface lets you run the Voxtral model o...