Hervé BREDIN
The screenshot shows the UI before I edit anything. The model identified one speaker (leonard) and left another one unidentified? Why are there so many SPEAKER_XX (speaker_00 to speaker_06)?
[`parallel`](https://www.gnu.org/software/parallel/) prints the following message when you run it ``` Academic tradition requires you to cite works you base your article on. If you use programs that use GNU Parallel...
I am wondering whether it would be a good idea to overlap chunks a bit. Instead of showing [10, 20] -> [20, 30], we could actually show [9, 21]...
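A minimal sketch of what such overlapping chunking could look like, assuming fixed-size windows extended by a configurable margin on both sides (function name and parameters are hypothetical, not part of any existing API):

```python
def chunks(duration: float, size: float = 10.0, overlap: float = 1.0):
    """Yield (start, end) windows of `size` seconds, each extended by
    `overlap` seconds on both sides, clamped to [0, duration]."""
    start = 0.0
    while start < duration:
        end = min(start + size, duration)
        yield (max(0.0, start - overlap), min(duration, end + overlap))
        start = end

# For a 30-second file, the middle window becomes [9, 21] instead of [10, 20]:
windows = list(chunks(30.0))  # [(0.0, 11.0), (9.0, 21.0), (19.0, 30.0)]
```

With this scheme, each second near a chunk boundary is visible in two consecutive windows, which gives the annotator context for speech that straddles the cut.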
We need to discuss how speaker embeddings can be reused/reloaded when relaunching Prodigy on an existing file. How/where are embeddings stored as numpy arrays on disk?
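One possible caching scheme, sketched here as a starting point for the discussion: one `.npy` file per audio file, keyed by the audio file's stem. The cache directory name and function names are hypothetical, not an existing convention:

```python
from pathlib import Path
from typing import Optional

import numpy as np

# Hypothetical cache layout: embeddings/<audio-stem>.npy
CACHE = Path("embeddings")

def save_embeddings(audio_path: str, embeddings: np.ndarray) -> Path:
    """Persist one (num_speakers, dimension) embedding matrix per audio file."""
    CACHE.mkdir(exist_ok=True)
    target = CACHE / (Path(audio_path).stem + ".npy")
    np.save(target, embeddings)
    return target

def load_embeddings(audio_path: str) -> Optional[np.ndarray]:
    """Return previously computed embeddings, or None on a first run."""
    target = CACHE / (Path(audio_path).stem + ".npy")
    return np.load(target) if target.exists() else None
```

On relaunch, `load_embeddings` returning `None` would signal that embeddings must be (re)computed; anything else can be reused directly.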
To evaluate the impact of various design choices, we should log the user interactions and store them into the Prodigy database. For instance, we should log: * how many times...
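A minimal sketch of what such interaction logging could look like, assuming each action is recorded as a JSON-serializable dict (the event names and fields are illustrative, not an existing schema; persisting to the Prodigy database is out of scope here):

```python
import time
from collections import Counter

# Hypothetical in-memory event log: one dict per user action.
events = []

def log_event(kind: str, **details) -> None:
    """Append a timestamped interaction record."""
    events.append({"kind": kind, "time": time.time(), **details})

# Example interactions during an annotation session:
log_event("play", start=9.0, end=21.0)
log_event("relabel", old="SPEAKER_03", new="leonard")
log_event("play", start=19.0, end=30.0)

# Aggregate per interaction type to answer questions like "how many times...":
counts = Counter(event["kind"] for event in events)
```

Because each record is a plain dict, it could later be serialized and stored alongside the annotations themselves.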
I'd like to pretrain a couple of models to host them on Hugging Face along with the others. What kind of classes/training datasets would you suggest? I was thinking about `MALE`/`FEMALE`...
Using the following piece of code in a Jupyter Notebook will create a nice video where audio is synchronized with the annotation ```python from pyannote.audio.utils.preview import preview from pyannote.core import...
https://github.com/pytorch/audio/issues/1442#issuecomment-1032358815
[pyannote.audio 2.0](https://github.com/pyannote/pyannote-audio/tree/develop) will bring a unified pipeline API: ```python from pyannote.audio import Pipeline pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization") output = pipeline("audio.wav") # or pipeline({"waveform": np.ndarray, "sample_rate": int}) ``` where `output` is a...