add fastconformer timestamp (reopen)

Open biscayan opened this issue 1 year ago • 13 comments

What does this PR do ?

I closed pull request #8341 because of many conflicts in the branch. I tried to resolve them, but in the end I deleted the branch, created a new one, and added a sign-off to the commit. Sorry for my mistakes.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Jenkins CI

To run Jenkins, a NeMo User with write access must comment jenkins on the PR.

Before your PR is "Ready for review"

Pre checks:

  • [ ] Make sure you read and followed Contributor guidelines
  • [ ] Did you write any new necessary tests?
  • [ ] Did you add or update any necessary documentation?
  • [ ] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • [ ] Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • [ ] New Feature
  • [ ] Bugfix
  • [ ] Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

@titu1994 @nithinraok @tango4j Anyone in the community is free to review the PR once the checks have passed. Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

biscayan avatar Feb 16 '24 02:02 biscayan

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot] avatar Mar 02 '24 01:03 github-actions[bot]

@biscayan

Do you remember how you came up with model_stride_sec = 0.08? (Link to the line.) With 0.08 sec, FastConformer is not yielding the right timestamps. It is generating a slightly longer sample length.

Timestamp generation for the FastConformer model needs more investigation.

tango4j avatar Mar 05 '24 23:03 tango4j

@tango4j Thank you for reviewing. The FastConformer paper explains that the model uses a downsampling scheme with an 80 ms frame rate, which means each output frame spans 80 ms. I tested the extracted timestamps while listening to the wav file. With model_stride_sec=0.04, the timestamps do not match the wav file; after changing model_stride_sec from 0.04 to 0.08, the timestamps match the wav file accurately.

biscayan avatar Mar 06 '24 01:03 biscayan
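The relationship described in the comment above can be sketched in a few lines. This is a minimal illustration, not the code from the PR: it assumes a 10 ms feature window stride, 8x subsampling for FastConformer, and 4x subsampling for Conformer, and the helper names `model_stride_sec_from` and `frames_to_timestamps` are hypothetical.

```python
def model_stride_sec_from(window_stride_sec: float, subsampling_factor: int) -> float:
    """Duration of one encoder output frame, in seconds."""
    return window_stride_sec * subsampling_factor


def frames_to_timestamps(frame_spans, stride_sec):
    """Convert (start_frame, end_frame) pairs to (start_sec, end_sec) pairs."""
    return [
        (round(start * stride_sec, 3), round(end * stride_sec, 3))
        for start, end in frame_spans
    ]


# Assumed values: 10 ms feature stride, 8x (FastConformer) vs 4x (Conformer).
fastconformer_stride = model_stride_sec_from(0.01, 8)  # about 0.08 s
conformer_stride = model_stride_sec_from(0.01, 4)      # about 0.04 s

# Hypothetical word spans expressed in encoder frame indices.
word_frames = [(0, 5), (10, 25)]

print(frames_to_timestamps(word_frames, fastconformer_stride))
print(frames_to_timestamps(word_frames, conformer_stride))
```

Under these assumptions, the second span maps to 0.8 to 2.0 s with a stride of 0.08 but to 0.4 to 1.0 s with a stride of 0.04, so using the wrong stride shifts every timestamp by a factor of two.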

jenkins

nithinraok avatar Mar 08 '24 19:03 nithinraok

@biscayan Did you check cpWER, or did you only check DER? When I tested with CallHome data, FastConformer was showing roughly 50% cpWER, while other CTC models show 18~20% error. I think there are some issues with the chunk-based ASR system.

tango4j avatar Mar 08 '24 23:03 tango4j
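For readers unfamiliar with the metric mentioned above: cpWER (concatenated minimum-permutation word error rate) concatenates each speaker's utterances, scores every possible mapping between reference and hypothesis speakers, and keeps the lowest error. The sketch below is an illustrative brute-force version, not the evaluation code used in this thread; the function names and the dictionary input format are assumptions, and the permutation search is only practical for a handful of speakers.

```python
from itertools import permutations


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cpwer(ref_by_speaker, hyp_by_speaker):
    """Brute-force cpWER over all reference/hypothesis speaker mappings."""
    refs = [" ".join(utts).split() for utts in ref_by_speaker.values()]
    hyps = [" ".join(utts).split() for utts in hyp_by_speaker.values()]
    # Pad with empty streams so both sides have the same number of speakers.
    n = max(len(refs), len(hyps))
    refs += [[] for _ in range(n - len(refs))]
    hyps += [[] for _ in range(n - len(hyps))]
    total_ref_words = sum(len(r) for r in refs)
    if total_ref_words == 0:
        raise ValueError("reference contains no words")
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_words


# Hypothetical two-speaker session; hypothesis speaker labels are arbitrary.
ref = {"A": ["hello there", "how are you"], "B": ["fine thanks"]}
hyp = {"spk0": ["fine thanks"], "spk1": ["hello there", "how you"]}
print(cpwer(ref, hyp))
```

Because the metric depends on which words land in which speaker's stream, inaccurate word timestamps can inflate cpWER even when the raw transcript is mostly correct, which is one plausible reading of the gap reported above.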

@tango4j I didn't compute cpWER for the FastConformer model. I only compared the ASR result with the transcript on clean speech data, and the result was almost accurate. However, I also ran into the problem on real reverberant meeting data, where the model has low ASR performance.

biscayan avatar Mar 11 '24 00:03 biscayan

@tango4j Which other CTC models were you comparing against? Are they of similar size?

nithinraok avatar Mar 11 '24 02:03 nithinraok

@nithinraok Model size does not affect the timestamp issue with chunk-based ASR. Only FastConformer CTC models return inaccurate timestamps with chunk-based ASR. We use Conformer-CTC large (stt_en_conformer_ctc_large) for testing.

tango4j avatar Mar 11 '24 19:03 tango4j

@tango4j Should I do anything more for the review?

biscayan avatar Mar 15 '24 07:03 biscayan

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot] avatar Mar 30 '24 01:03 github-actions[bot]

This PR was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Apr 06 '24 01:04 github-actions[bot]

@KunalDhawan Please test this PR on a known set to compare against a similar-sized Conformer model.

nithinraok avatar May 08 '24 18:05 nithinraok

Also add an integration test! Testing on an English dataset is good enough.

nithinraok avatar May 08 '24 18:05 nithinraok
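One check such an integration test might include is a sanity pass over the word timestamps, since the earlier report was that they ran longer than the audio. The sketch below is only an assumption about what that check could look like: the helper `find_timestamp_violations`, its `(word, start_sec, end_sec)` input format, and the tolerance value are hypothetical and are not part of NeMo.

```python
def find_timestamp_violations(words, audio_duration_sec, tol_sec=0.08):
    """Return human-readable problems found in (word, start_sec, end_sec) tuples."""
    problems = []
    prev_start = 0.0
    for word, start, end in words:
        if start < 0:
            problems.append(f"{word!r}: negative start {start}")
        if end < start:
            problems.append(f"{word!r}: end {end} precedes start {start}")
        if end > audio_duration_sec + tol_sec:
            problems.append(f"{word!r}: end {end} exceeds audio duration")
        if start < prev_start:
            problems.append(f"{word!r}: start {start} is out of order")
        prev_start = start
    return problems


# Hypothetical output for a 1.0 s clip; the last word overruns the audio.
words = [("hi", 0.0, 0.4), ("there", 0.4, 1.3)]
for problem in find_timestamp_violations(words, audio_duration_sec=1.0):
    print(problem)
```

The tolerance defaults to one assumed 80 ms frame so that a final word ending inside the last partial frame is not flagged.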

Hi @KunalDhawan Thank you for your report. I will develop the PR later. Thank you.

biscayan avatar May 21 '24 23:05 biscayan