Taejin Park

36 comments by Taejin Park

This does not seem to be a bug or issue in the NeMo pipeline (changed to question); in the worst case, this could be the limit of model performance.

Hi. I think TensorFlow will still be faster than PyTorch, although results vary across datasets and architectures. You might want to test with various examples. If you are not using...

To-do list before opening this PR:
- Add a CI test for MSDD inference
- Fix the minor mistake in the figures in diarization/model
- Spell-check docs/tutorials

Speaker diarization can be run on a 90-minute audio file, but it could take a few minutes to get the result since the clustering algorithm relies on eigenvalue decomposition (O(n^3)...
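To make the cost concrete, here is a minimal sketch of where a dense eigenvalue decomposition shows up in affinity-based clustering. It is not the NeMo implementation: the function name `estimate_num_speakers`, the cosine affinity, the unnormalized Laplacian, and the eigengap heuristic are all assumptions chosen for illustration. The point is only that `n` segment embeddings produce an `n x n` matrix, and decomposing it scales roughly as O(n^3).

```python
import numpy as np


def estimate_num_speakers(embeddings: np.ndarray, max_speakers: int = 8) -> int:
    """Illustrative eigengap heuristic (hypothetical helper, not NeMo code)."""
    # Normalize rows so that the dot product equals cosine similarity.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    # n x n affinity matrix; negative similarities are clipped to zero.
    affinity = np.clip(normed @ normed.T, 0.0, 1.0)
    # Unnormalized graph Laplacian L = D - A.
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    # Dense symmetric eigendecomposition: the O(n^3) step.
    eigvals = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    # The largest gap among the smallest eigenvalues suggests the cluster count.
    gaps = np.diff(eigvals[: max_speakers + 1])
    return int(np.argmax(gaps)) + 1
```

Because the matrix grows with the number of segments, doubling the audio length roughly doubles `n` and multiplies the decomposition time by about eight under this sketch.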

@SagyHarpazGong Unfortunately, it is quite hard to count the number of speakers correctly if there is only relatively short speech from the speakers. Speaker diarization is also a type of pattern recognition...

Thank you for suggesting this idea. I currently don't have the bandwidth to test this, so let me check on it as soon as I have the bandwidth to take care of...

The NeMo neural diarization class (clustering + MSDD) is not ONNX-exportable. This is because the PyTorch linear algebra library is not fully ONNX-exportable. We are working on fully neural diarization and only...

@biscayan Do you remember how you came up with model_stride_sec = 0.08? [Link to the line](https://github.com/biscayan/NeMo/blob/9f50dbdabed105d2649384e403c71147d57b4bae/nemo/collections/asr/parts/utils/decoder_timestamps_utils.py#L357C1-L358C1) With 0.08 sec, the fast conformer is not yielding the right timestamps. It is...
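For context on why this value matters, here is a minimal sketch of how a per-frame stride is typically turned into timestamps. The helper names `stride_from_config` and `frame_to_time`, and the example numbers (a 0.01 s feature window stride with subsampling factors of 4 and 8), are assumptions for illustration only; they are not taken from the linked file and do not say which stride is correct for any particular model.

```python
def stride_from_config(window_stride_sec: float, subsampling_factor: int) -> float:
    """Seconds of audio covered by one encoder output frame (hypothetical helper)."""
    return window_stride_sec * subsampling_factor


def frame_to_time(frame_index: int, model_stride_sec: float) -> float:
    """Convert an encoder frame index to a timestamp in seconds."""
    return round(frame_index * model_stride_sec, 2)


# Assumed example values: the same frame index lands at very different times
# depending on the stride, so a wrong stride scales every timestamp.
stride_a = stride_from_config(0.01, 8)  # assumed 8x subsampling
stride_b = stride_from_config(0.01, 4)  # assumed 4x subsampling
print(frame_to_time(50, stride_a), frame_to_time(50, stride_b))
```

If the stride does not match the model's actual frame rate, every word boundary is shifted by the same ratio, which would hurt word-to-speaker alignment even when the transcript itself is fine.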

@biscayan Did you check the cpWER, or did you check only the DER? When I tested with CallHome data, the fast conformer was showing roughly 50% cpWER while other CTC...

@nithinraok Model size does not affect the timestamp issue with chunk-based ASR. Only fast-conformer CTC models return inaccurate timestamps with chunk-based ASR. We use ConformerCTC large...