obsidian-copilot Support Audio Transcription

Context

Users have a lot of notes stored in audio format. We should support transcribing these audio files so users can digest them in Obsidian Copilot.

Note that this request is only about transcribing existing audio files in the Obsidian vault. Audio recording is out of scope for this request.

We need to support audio transcription in the Copilot backend. Should we reuse doc4llm or create a new endpoint?

Jul 07 '25 22:07 wenzhengjiang

Jul 08 '25 18:07 logancyang

What would the frontend UX be?

My current thinking:

- In the settings page, we provide a toggle for auto-transcribing and an option for where to store the transcripts.
  - In auto-transcribing mode, any audio file added to the vault is transcribed automatically.
- Users can also trigger transcription manually via a button in the context menu.
- Users can also add audio files directly as chat context; in that case we store the transcript in the local cache.
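
To make the settings above concrete, here is a rough sketch of how the toggle and the storage option might drive the auto-transcribe decision. Every name in it (`TranscriptionSettings`, `shouldAutoTranscribe`, `transcriptPathFor`, the extension list, the `.md` output) is hypothetical and not taken from the plugin codebase.

```typescript
// Hypothetical settings shape; none of these names come from the actual plugin.
interface TranscriptionSettings {
  autoTranscribe: boolean;   // the toggle in the settings page
  transcriptFolder: string;  // where transcripts are stored, relative to the vault root
}

// Assumed list of audio extensions; which formats to support is still open.
const AUDIO_EXTENSIONS = new Set(["mp3", "wav", "m4a", "ogg", "flac", "webm"]);

// Returns the lowercase extension of a vault path, or "" if there is none.
function fileExtension(path: string): string {
  const name = path.split("/").pop() ?? "";
  const dot = name.lastIndexOf(".");
  return dot > 0 ? name.slice(dot + 1).toLowerCase() : "";
}

function isAudioFile(path: string): boolean {
  return AUDIO_EXTENSIONS.has(fileExtension(path));
}

// Auto mode: transcribe any audio file added to the vault, but only if the toggle is on.
function shouldAutoTranscribe(path: string, settings: TranscriptionSettings): boolean {
  return settings.autoTranscribe && isAudioFile(path);
}

// Maps an audio file to its transcript note inside the configured folder.
function transcriptPathFor(audioPath: string, settings: TranscriptionSettings): string {
  const name = audioPath.split("/").pop() ?? audioPath;
  const dot = name.lastIndexOf(".");
  const base = dot > 0 ? name.slice(0, dot) : name;
  const folder = settings.transcriptFolder.replace(/\/+$/, "");
  return folder ? `${folder}/${base}.md` : `${base}.md`;
}
```

The manual context-menu trigger would skip `shouldAutoTranscribe` and only need `isAudioFile`; the chat-context path would write to the local cache instead of `transcriptPathFor`.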

We should also think about enabling an offline mode for this.

Makes sense. Audio files can often contain private information. We can support self-hosted STT models like Whisper.
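
One way this could look in config, as a sketch only: an offline flag that routes transcription to a user-provided server instead of our backend. The field names and `resolveTranscriptionUrl` are made up for illustration, and the `/v1/audio/transcriptions` route is an assumption that both servers expose an OpenAI-style transcription endpoint.

```typescript
// Hypothetical config; field names are illustrative only.
interface SttConfig {
  offlineMode: boolean;       // when true, audio is only sent to the self-hosted server
  selfHostedBaseUrl: string;  // e.g. a local Whisper server configured by the user
  hostedBaseUrl: string;      // the hosted backend, whatever its real URL turns out to be
}

// Assumption: both servers expose an OpenAI-style transcription route.
const TRANSCRIPTION_PATH = "/v1/audio/transcriptions";

// Picks the endpoint for the selected mode and fails loudly if it is not configured,
// so offline mode never silently falls back to the hosted backend.
function resolveTranscriptionUrl(config: SttConfig): string {
  const base = config.offlineMode ? config.selfHostedBaseUrl : config.hostedBaseUrl;
  if (!base) {
    throw new Error("No STT base URL configured for the selected mode");
  }
  return base.replace(/\/+$/, "") + TRANSCRIPTION_PATH;
}
```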

Jul 09 '25 20:07 wenzhengjiang