obsidian-copilot
Support Audio Transcription
Context
Users have a lot of notes stored in audio format. We should support transcribing these audio files so users can digest them in Obsidian Copilot.
Note that this request is only about transcribing existing audio files in the Obsidian vault. Audio recording is out of scope for this request.
Implementation
We need to support transcribing audio in the Copilot backend. Should we reuse doc4llm or create a new endpoint?
Access to speech-to-text (STT) models
- Can this be included in the Plus subscription?
Caching
- We could cache the transcription result in the user's vault as a Markdown note.
- What would the frontend UX be?
- We should also think about enabling offline mode for this
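To make the caching idea concrete, here is a minimal sketch of how a transcript could be stored as a Markdown note next to a freshness marker. Everything in it is an assumption for discussion: the names `TranscriptRecord`, `transcriptNotePath`, `renderTranscriptNote`, and `isCacheFresh`, the frontmatter keys, and the choice to key freshness on the audio file's modification time are all hypothetical and not taken from the Copilot codebase.

```typescript
// Hypothetical sketch: none of these names exist in Obsidian Copilot today.
interface TranscriptRecord {
  audioPath: string;  // vault-relative path of the source audio file
  audioMtime: number; // modification time of the audio file, in ms
  text: string;       // transcript returned by the STT model
}

// Derive the note path for a transcript, e.g. "Transcripts/meeting.m4a.md".
// Keeping the audio extension in the name avoids collisions between
// "meeting.m4a" and "meeting.wav".
function transcriptNotePath(audioPath: string, folder: string): string {
  const fileName = audioPath.split("/").pop() ?? audioPath;
  const cleanFolder = folder.replace(/\/+$/, ""); // drop trailing slashes
  return cleanFolder ? `${cleanFolder}/${fileName}.md` : `${fileName}.md`;
}

// Render the note body: frontmatter that records the source, then the text.
function renderTranscriptNote(record: TranscriptRecord): string {
  return [
    "---",
    `source: ${JSON.stringify(record.audioPath)}`,
    `source_mtime: ${record.audioMtime}`,
    "---",
    "",
    record.text.trim(),
    "",
  ].join("\n");
}

// A cached note is fresh if it was generated from the same version of the audio.
function isCacheFresh(noteContent: string, audioMtime: number): boolean {
  const match = noteContent.match(/^source_mtime: (\d+)$/m);
  return match !== null && Number(match[1]) === audioMtime;
}
```

With this shape, re-transcription only happens when the audio file changes, and the transcript stays a plain note that users can read, search, and edit like any other.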
What would the frontend UX be?
My current thinking:
- On the settings page, we provide a toggle for auto-transcription and an option for where to store the transcripts.
- In auto-transcribing mode, any audio file added to the vault will be transcribed automatically.
- Users can also manually trigger transcription via a button in the context menu.
- Users can also add audio files directly as chat context; in that case we store the transcript in the local cache.
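A rough sketch of how the settings and the three entry points above could fit together. The setting names, the `Trigger` type, and the list of audio extensions are placeholders I made up for illustration, not existing plugin settings or a confirmed list of supported formats.

```typescript
// Hypothetical settings shape; field names are placeholders.
interface TranscriptionSettings {
  autoTranscribe: boolean;  // the toggle on the settings page
  transcriptFolder: string; // where transcripts are stored in the vault
}

// Assumed set of extensions; the real list depends on what the STT model accepts.
const AUDIO_EXTENSIONS = new Set(["mp3", "m4a", "wav", "ogg", "flac", "webm"]);

function isAudioFile(path: string): boolean {
  const dot = path.lastIndexOf(".");
  if (dot < 0) return false;
  return AUDIO_EXTENSIONS.has(path.slice(dot + 1).toLowerCase());
}

// The three ways a transcription can start, per the list above.
type Trigger = "file-added" | "context-menu" | "chat-context";

// Only the "file-added" path is gated by the auto-transcribe toggle;
// explicit user actions always go through.
function shouldTranscribe(
  path: string,
  trigger: Trigger,
  settings: TranscriptionSettings
): boolean {
  if (!isAudioFile(path)) return false;
  return trigger === "file-added" ? settings.autoTranscribe : true;
}
```

Keeping the decision in one pure function means the vault listener, the context menu item, and the chat context handler can all share it.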
We should also think about enabling offline mode for this
Makes sense. Audio files can often contain private information. We can support self-hosted STT models like Whisper.
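One way to keep offline mode cheap to add later is to hide the STT backend behind a small interface and choose the implementation at call time. This is only a sketch under my own assumptions: `Transcriber` and `pickTranscriber` are hypothetical names, and the actual calls to the Copilot backend or a self-hosted Whisper server are deliberately left out.

```typescript
// Hypothetical abstraction over STT backends; not an existing Copilot interface.
interface Transcriber {
  readonly name: string;
  transcribe(audio: Uint8Array): Promise<string>;
}

// In offline mode audio must never leave the machine, so we refuse to fall
// back to the hosted backend when no self-hosted one is configured.
function pickTranscriber(
  offlineMode: boolean,
  hosted: Transcriber,
  selfHosted?: Transcriber
): Transcriber {
  if (offlineMode) {
    if (!selfHosted) {
      throw new Error("Offline mode requires a self-hosted STT backend");
    }
    return selfHosted;
  }
  return hosted;
}
```

Failing loudly instead of silently falling back matters here, since the point of offline mode is that private audio is not uploaded.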