Taejin Park comments

Results: 36 comments of Taejin Park

Speaker Diarization with Marblenet and ClusterDiarizer issue

This does not seem to be a bug or issue in the NeMo pipeline (changed to question). In the worst case, this could be the limit of the model's performance.

Optimization difference of PyTorch and Tensorflow

Hi. I think TensorFlow will still be faster than PyTorch, although results vary across datasets and architectures. You might want to test with various examples. If you are not using...

Tutorials and Docs for Multi-scale Diarization Decoder

To-do list before opening this PR:
- Add CI test for MSDD inference
- Fix the minor mistake in the figures in diarization/model
- Spell check docs/tutorials

Speaker Diarization - Audio Length Limit

Speaker diarization can be done on a 90-minute audio file, but it could take a few minutes to get the result, since the clustering algorithm relies on eigenvalue decomposition (O(n^3)...
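To make the cost mentioned above concrete, here is a minimal sketch of spectral-clustering-style speaker counting. It is not NeMo's implementation: the function name `estimate_num_speakers`, the `max_speakers` parameter, and the synthetic embeddings are all assumptions made for illustration. It builds an n x n affinity matrix over n segment embeddings and takes the eigenvalues of its graph Laplacian, which is the step that scales roughly as O(n^3) in the number of segments.

```python
import numpy as np


def estimate_num_speakers(embeddings: np.ndarray, max_speakers: int = 8) -> int:
    """Hypothetical helper: guess the speaker count from the largest eigengap."""
    # Cosine-similarity affinity between all pairs of segments (n x n).
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    affinity = np.clip(unit @ unit.T, 0.0, 1.0)
    np.fill_diagonal(affinity, 0.0)

    # Unnormalized graph Laplacian L = D - A.
    laplacian = np.diag(affinity.sum(axis=1)) - affinity

    # Dense symmetric eigendecomposition: the O(n^3) step in n segments.
    eigvals = np.linalg.eigvalsh(laplacian)  # ascending order

    # The position of the largest gap among the smallest eigenvalues
    # is used as the estimated number of clusters (speakers).
    gaps = np.diff(eigvals[: max_speakers + 1])
    return int(np.argmax(gaps)) + 1


# Synthetic data (an assumption, not real speaker embeddings):
# two well-separated groups of 20 segments each in a 4-dim space.
rng = np.random.default_rng(0)
centers = np.eye(4)[:2]
demo_embeddings = np.vstack(
    [c + 0.05 * rng.standard_normal((20, 4)) for c in centers]
)
demo_count = estimate_num_speakers(demo_embeddings)
```

Because n grows with audio length (more segments), doubling the audio duration in a sketch like this would make the eigendecomposition roughly eight times more expensive, which is consistent with long files taking noticeably longer.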

NeMo MSDD miss detection

@SagyHarpazGong Unfortunately, it is quite hard to correctly count the number of speakers if some speakers have only relatively short stretches of speech. Speaker diarization is also a type of pattern recognition...

Fixed chokepoint in diarization for longer audios

Thank you for suggesting this idea. I currently don't have the bandwidth to test this, so let me check on it as soon as I have bandwidth to take care of...

Unable to export diar_msdd_telephonic (neural diarizer) model to onnx format

The NeMo neural diarization class (clustering+MSDD) is not ONNX-exportable. This is because the PyTorch linear algebra library is not fully ONNX-exportable. We are working on fully neural diarization and only...

add fastconformer timestamp (reopen)

@biscayan Do you remember how you came up with model_stride_sec = 0.08? [Link to the line](https://github.com/biscayan/NeMo/blob/9f50dbdabed105d2649384e403c71147d57b4bae/nemo/collections/asr/parts/utils/decoder_timestamps_utils.py#L357C1-L358C1) With 0.08 sec, fast conformer is not yielding the right timestamps. It is...
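As a rough illustration of why the stride value matters, the sketch below converts decoder frame indices to seconds by multiplying by a stride. The helper name `frames_to_seconds` is hypothetical and is not the function in `decoder_timestamps_utils.py`; the comparison value 0.04 is an arbitrary assumption used only to show the effect, not a claim about the correct stride for any model.

```python
from typing import List


def frames_to_seconds(frame_indices: List[int], model_stride_sec: float) -> List[float]:
    """Hypothetical helper: map output frame indices to timestamps in seconds."""
    # Each output frame is assumed to cover `model_stride_sec` seconds of audio.
    return [round(i * model_stride_sec, 3) for i in frame_indices]


frames = [0, 25, 50]
with_008 = frames_to_seconds(frames, 0.08)  # stride quoted in the comment
with_004 = frames_to_seconds(frames, 0.04)  # arbitrary alternative for comparison
```

In a linear mapping like this, a stride that is off by some factor shifts every timestamp by the same factor, so the error grows with position in the audio rather than staying constant.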

add fastconformer timestamp (reopen)

@biscayan Did you check the cpWER, or did you just check DER? When I tested with CallHome data, the fast conformer was showing 50%-ish cpWER while other CTC...

add fastconformer timestamp (reopen)

@nithinraok Model size does not affect the timestamp issue with chunk-based ASR. Only fast-conformer CTC models return inaccurate timestamps with chunk-based ASR. We use ConformerCTC large...