feat: add audio-text search

Open slettner opened this issue 2 years ago • 0 comments

Audio-Text Search

This feature adds audio as a new modality to jina now. It introduces an audio-text bi-modal search scenario, showcased with one demo dataset of music data. Further audio cases might be added later, e.g. using environmental sounds, which are also well supported by pre-trained models (e.g. audio-clip).

The following is a running list of the changes required to realize the audio case in jina now:

  • [ ] add case hierarchy (top-level is modality-combination, second-level is the specific data set) to the cli dialog (#107)
  • [x] prepare dataset for music-text demo and upload to storage
  • [x] prepare custom executor for the demo case and push to hub
  • [x] update data loading logic to work with new dataset (#120)
  • [ ] update fine-tuning logic to work with audio data
  • [ ] update frontend app to work with the audio-text data (the user can search with text or choose from pre-selected songs, similar to the existing image case)
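
As a rough illustration of the two-level case hierarchy from the first item (modality combination at the top level, specific data set at the second level), here is a minimal sketch. The mapping, the dataset names, and both helper functions are hypothetical and are not taken from the jina now code base; the actual cli dialog may structure this differently.

```python
# Hypothetical two-level case hierarchy: modality combination -> data sets.
# All names below are illustrative assumptions, not jina now's actual options.
CASES = {
    "image-text": ["example-image-dataset"],
    "audio-text": ["example-music-dataset"],
}


def list_modalities(cases):
    """Return the top-level choices (modality combinations) in sorted order."""
    return sorted(cases)


def list_datasets(cases, modality):
    """Return the second-level choices (data sets) for one modality combination."""
    if modality not in cases:
        raise KeyError(f"unknown modality combination: {modality!r}")
    return list(cases[modality])
```

A dialog built on such a structure would first ask for one of `list_modalities(CASES)` and then offer only the matching entries from `list_datasets(CASES, modality)`.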

slettner, Apr 19 '22 08:04