CLIPPyX
CLAPPyX (audio version of this)
I think a discussion post would be more appropriate for this, but discussions are not open at the time of writing.
Have you considered making a version of this repo that works with audio files instead of images? My understanding is that replacing CLIP with CLAP should give similar performance, but for audio files instead of images.
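To make the idea concrete, here is a minimal sketch of the retrieval step only, not how this repo does it. It assumes audio embeddings and a text-query embedding already exist in a shared space, which is what CLAP is meant to provide (with Hugging Face `transformers`, I believe they would come from `ClapModel.get_audio_features` and `ClapModel.get_text_features`). The vectors, file names, and the `rank_audio` helper below are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_audio(query_vec, index, top_k=3):
    """Return the top_k (path, score) pairs, best match first.

    `index` maps a file path to its embedding (hypothetical structure).
    """
    scored = [(path, cosine(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real CLAP embeddings.
toy_index = {
    "rain.wav": [0.9, 0.1, 0.0],
    "dog_bark.wav": [0.1, 0.9, 0.2],
    "piano.wav": [0.0, 0.2, 0.9],
}
toy_query = [0.2, 0.8, 0.1]  # pretend this is the embedding of "a dog barking"

results = rank_audio(toy_query, toy_index, top_k=2)
for path, score in results:
    print(f"{path}: {score:.3f}")
```

If the embeddings come from the same kind of joint text/audio model, the ranking logic should not need to change compared to the image case; only the encoder would.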
Very good idea, and it might work for videos too. I'm also thinking of a broader way to apply it: transcribe the video using Whisper ➡️ get the text with the corresponding timestamps. That way we could search and get the exact timestamp. It would be useful for lectures imo
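A rough sketch of the search half of that idea, assuming the transcript is already available as a list of segments with `start`, `end`, and `text` keys (the shape that openai-whisper's `model.transcribe(path)["segments"]` returns). The segments below are invented, the helper names are hypothetical, and plain substring matching stands in for a proper semantic search over text embeddings.

```python
def format_timestamp(seconds):
    """Convert seconds to an HH:MM:SS string (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_segments(segments, query):
    """Return (timestamp, text) for every segment whose text contains the query.

    Case-insensitive substring match; a real version would likely
    compare embeddings instead.
    """
    needle = query.lower()
    return [
        (format_timestamp(seg["start"]), seg["text"].strip())
        for seg in segments
        if needle in seg["text"].lower()
    ]


# Invented segments in the shape Whisper produces.
toy_segments = [
    {"start": 0.0, "end": 4.2, "text": " Welcome to the lecture."},
    {"start": 65.5, "end": 71.0, "text": " Today we cover gradient descent."},
    {"start": 3725.0, "end": 3730.0, "text": " Gradient clipping comes next."},
]

hits = search_segments(toy_segments, "gradient")
for stamp, text in hits:
    print(stamp, text)
```

Each hit carries the start time of its segment, so a player could seek straight to that point in the lecture.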