speaker-transcription
Transcription with speaker diarization pipeline
Unsure if this is a Replicate issue or not. I ran the model on an 8-minute mp3 via the API, which ran fine in the expected amount of time....
Hey, thank you for the package :) I'm researching how to reduce diarization errors related to overlapping speech, and I'd like to ask you about your choice of...
I have an M2 MacBook and attempted to run this on my computer. Docker couldn't run it because it failed to find a GPU (it was looking for Nvidia), and...
I get this error when I call the Meronym API (hosted on a remote server) with a file larger than 100 MB. Is there an env var I could assign...
I am trying to use this with a large audio input (3.5 hours or so). Since the GPU it uses is fixed, replicate.com fails with: `Prediction failed for an unknown...
What would it take to do this for Spanish? If you outline the steps I need to follow, I will give it a shot.
I have not been able to find much information on what `speakers.embeddings` signifies. For example, here is some output from this model: ```json { "segments": [...], "speakers": { "count": 2,...
Similar to #2 - it would be great if we could specify which `whisper` model to use (`large`, `base`, etc.). Using smaller models would probably be fine for my workflow right...