
Support Audio Transcription

Open wenzhengjiang opened this issue 5 months ago • 2 comments

Context

Many users keep notes as audio files. We should support transcribing these files so that users can digest them in Obsidian Copilot.

Note that this request only covers transcribing existing audio files in the Obsidian vault. Audio recording is out of scope.

Implementation

We need to support audio transcription in the Copilot backend. Should we reuse doc4llm or create a new endpoint?

Access to STT models

  • Could this be included in the Plus subscription?

Caching

  • We could cache the transcription result in the user's vault as a Markdown note.
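
A minimal sketch of how such a cache could work, assuming the transcript note records the source file's modification time in its frontmatter so that a changed audio file invalidates the cache. All names here (`transcriptNotePath`, `renderTranscriptNote`, `isCacheFresh`, the `source_mtime` field, the `__` path flattening) are hypothetical and not taken from the Copilot codebase.

```typescript
// Hypothetical shape of a cached transcript; not an existing Copilot type.
interface CachedTranscript {
  sourcePath: string;  // vault-relative path of the audio file
  sourceMtime: number; // modification time of the audio file, in ms
  body: string;        // transcribed text
}

// Map an audio file to its transcript note. The full vault path is flattened
// into the file name so that same-named files in different folders do not collide.
function transcriptNotePath(audioPath: string, transcriptFolder: string): string {
  const flat = audioPath.replace(/^\/+/, "").split("/").join("__");
  const folder = transcriptFolder.replace(/\/+$/, "");
  return folder ? `${folder}/${flat}.md` : `${flat}.md`;
}

// Render the note with frontmatter that identifies the source file and its version.
function renderTranscriptNote(t: CachedTranscript): string {
  return [
    "---",
    `source: ${JSON.stringify(t.sourcePath)}`,
    `source_mtime: ${t.sourceMtime}`,
    "---",
    "",
    t.body.trim(),
    "",
  ].join("\n");
}

// Read back the recorded modification time, or null if the field is missing.
function readSourceMtime(note: string): number | null {
  const m = note.match(/^source_mtime: (\d+)$/m);
  return m ? Number(m[1]) : null;
}

// The cache is fresh only if a note exists and was produced from the current file version.
function isCacheFresh(note: string | null, currentMtime: number): boolean {
  return note !== null && readSourceMtime(note) === currentMtime;
}
```

Keeping the cache as a plain note means users can read and edit transcripts like any other note; the trade-off is that an edited transcript would still be treated as fresh as long as the audio file is unchanged.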

wenzhengjiang avatar Jul 07 '25 22:07 wenzhengjiang

  • What would the frontend UX be?
  • We should also think about enabling offline mode for this

logancyang avatar Jul 08 '25 18:07 logancyang

What would the frontend UX be?

My current thinking:

  • On the settings page, we provide a toggle for auto-transcription, along with options for where to store the transcripts.
    • In auto-transcription mode, any audio file added to the vault is transcribed automatically.
  • Users can also trigger transcription manually via a button in the context menu.
  • Users can also add audio files directly as chat context, in which case we store the transcript in the local cache.
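
The three entry points above could be routed through a single decision function. This is only a sketch: the settings fields, the trigger names, and the extension list are assumptions made for illustration, and sending manually triggered transcripts to a vault note is a guess rather than something decided in this thread.

```typescript
// Hypothetical settings; field names are illustrative only.
interface TranscriptionSettings {
  autoTranscribe: boolean;  // the proposed auto-transcription toggle
  transcriptFolder: string; // where transcript notes are stored
}

type Trigger = "vault-add" | "context-menu" | "chat-context";
type Destination = "vault-note" | "local-cache";

// Assumed set of audio extensions; the real list would depend on the STT backend.
const AUDIO_EXTENSIONS = new Set(["mp3", "m4a", "wav", "ogg", "flac", "webm"]);

function isAudioFile(path: string): boolean {
  const dot = path.lastIndexOf(".");
  return dot >= 0 && AUDIO_EXTENSIONS.has(path.slice(dot + 1).toLowerCase());
}

// Decide whether to transcribe a file and where the transcript should go.
// Returns null when nothing should happen.
function planTranscription(
  path: string,
  trigger: Trigger,
  settings: TranscriptionSettings
): { destination: Destination } | null {
  if (!isAudioFile(path)) return null;
  switch (trigger) {
    case "vault-add":
      // Only transcribe newly added files when the toggle is on.
      return settings.autoTranscribe ? { destination: "vault-note" } : null;
    case "context-menu":
      // Manual trigger always runs (assumed to write a vault note).
      return { destination: "vault-note" };
    case "chat-context":
      // Audio attached to a chat is transcribed into the local cache.
      return { destination: "local-cache" };
  }
}
```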

We should also think about enabling offline mode for this

Makes sense. Audio files often contain private information. We can support self-hosted STT models such as Whisper.
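
One way to keep the hosted and self-hosted paths interchangeable is to resolve the transcription URL from a backend setting. The sketch below is purely illustrative: the `SttBackend` type, the placeholder hosted URL, and the `/transcribe` path are all assumptions, not an existing Copilot or Whisper API.

```typescript
// Hypothetical backend choice; neither variant reflects an existing setting.
type SttBackend =
  | { kind: "hosted" }
  | { kind: "self-hosted"; baseUrl: string };

// Placeholder only; the real hosted endpoint is not specified in this thread.
const HOSTED_BASE_URL = "https://stt.example.invalid";

// Resolve the URL that audio would be sent to. With a self-hosted backend the
// audio only goes to the user-configured server, never to the hosted service.
function transcriptionUrl(backend: SttBackend): string {
  const base = backend.kind === "hosted" ? HOSTED_BASE_URL : backend.baseUrl.trim();
  if (base === "") {
    throw new Error("Self-hosted STT selected but no base URL is configured");
  }
  // "/transcribe" is an assumed path, used here only for illustration.
  return `${base.replace(/\/+$/, "")}/transcribe`;
}
```

Failing loudly on a missing self-hosted URL, rather than falling back to the hosted service, avoids sending private audio off the machine by accident.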

wenzhengjiang avatar Jul 09 '25 20:07 wenzhengjiang