quality concerns

Open • DmitriyG228 opened this issue 1 year ago • 1 comment

It looks like the pipeline quickly forgets previous speakers and assigns the wrong tags to new ones, so that a conversation among 4-5 people is inferred as a conversation between 2.

I am testing alongside whisperx, which seems to use the same set of default models but gives better results.

Before I dive deeper into debugging, are there any obvious things I could be doing wrong? I tried a non-default embedding model, with the same result.

DmitriyG228, Jan 06 '24 15:01

@DmitriyG228 you can check out related issues such as #4, #133 and #226, where this was already discussed.

juanmc2005, Feb 02 '24 15:02