Add Wav2Vec 2.0 weight-loading script
What does this PR do?
Adds a script that loads Wav2Vec 2.0 weights from the Fairseq implementation into the NeMo implementation. Also adjusts the NeMo implementation so that it matches Fairseq's more closely.
Collection: [ASR]
Changelog
- Added script to download and load state_dict from Fairseq Wav2Vec models to NeMo equivalents
- Adjusted order of normalization operations for wav2vec preprocessing
- Changed the Wav2Vec module config so that the feature projection is done in the transformer encoder
- Altered the contrastive loss so the config can opt out of combining groups (adds one optional boolean)
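The first changelog item comes down to renaming checkpoint keys so they line up with the NeMo module names. A minimal sketch of that idea in plain Python follows. The prefix pairs in `PREFIX_MAP` and the helper name `remap_keys` are hypothetical, chosen only for illustration; they are not taken from this PR, Fairseq, or NeMo.

```python
from collections import OrderedDict

# Hypothetical prefix pairs, for illustration only. The real Fairseq and NeMo
# parameter names are not listed in this PR description.
PREFIX_MAP = [
    ("feature_extractor.", "feature_encoder."),
    ("encoder.layers.", "transformer.layers."),
]


def remap_keys(state_dict, prefix_map=PREFIX_MAP):
    """Return a new ordered dict with the first matching prefix rewritten.

    Keys that match no prefix are copied through unchanged.
    """
    remapped = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old, new in prefix_map:
            if key.startswith(old):
                new_key = new + key[len(old):]
                break
        # Guard against two source keys collapsing onto one target key.
        if new_key in remapped:
            raise KeyError(f"two source keys map to {new_key!r}")
        remapped[new_key] = value
    return remapped
```

The values are passed through untouched, so the same function would work on a dict of tensors.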
Before your PR is "Ready for review"
Pre checks:
- [x] Make sure you read and followed Contributor guidelines
- [ ] Did you write any new necessary tests?
- [ ] Did you add or update any necessary documentation?
- [ ] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- [ ] Reviewer: Does the PR have correct import guards for all optional libraries?
PR Type:
- [x] New Feature
- [x] Bugfix
- [ ] Documentation
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information
- Related to # (issue)
This pull request introduces 2 alerts when merging a5a0e55a512d667b27adf1f63ba239aeab225748 into e67c4ca29a3855d1575f173af5b38ed3a9a91e68 - view on LGTM.com
new alerts:
- 1 for Unused local variable
- 1 for Unused import
This pull request introduces 1 alert when merging 94fb6c4e11a11021ce50ced7b81cd219a3db089a into e67c4ca29a3855d1575f173af5b38ed3a9a91e68 - view on LGTM.com
new alerts:
- 1 for Unused import
This pull request introduces 1 alert when merging 82aff032908a4ddeb64e73905721643cfbdcd96c into e67c4ca29a3855d1575f173af5b38ed3a9a91e68 - view on LGTM.com
new alerts:
- 1 for Unused import
This pull request introduces 1 alert when merging ea4299037f1df68499d93281ae4d401e9ab9e533 into e67c4ca29a3855d1575f173af5b38ed3a9a91e68 - view on LGTM.com
new alerts:
- 1 for Unused import
@sam1373
I can take the script out if it's not necessary, or expand it to also cover the CTC models once those are checked out. Either way, the changes to the Wav2Vec modules should address the discrepancies. I also don't have much of a way to test this beyond confirming that the state dict loads successfully with strict=True.
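For context, a strict=True load only succeeds when the checkpoint and the model agree on parameter names. That name comparison can be approximated without loading any weights, as in the rough sketch below. `strict_key_report` is a hypothetical helper, not part of this PR, and it only compares names, so it says nothing about tensor shapes or values.

```python
def strict_key_report(model_keys, checkpoint_keys):
    """Compare two collections of parameter names.

    Returns a dict with:
      - "missing":    names the model expects but the checkpoint lacks
      - "unexpected": names in the checkpoint the model does not have
    Both lists are empty when the names match exactly.
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),
        "unexpected": sorted(checkpoint_keys - model_keys),
    }
```

An empty report is necessary but not sufficient: matching names do not show that the converted model produces the same outputs as the original.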
Ideally we'd add the CTC models so we can test the WER, or at least try fine-tuning the pre-trained checkpoints to see that we get reasonable results. Otherwise we don't really know whether it works.
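As a reference point for the WER check mentioned above, word error rate is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below is a plain-Python illustration under that definition; `word_error_rate` is a hypothetical name and this is not NeMo's own metric implementation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word dropped
            insertion = curr[j - 1] + 1  # extra hypothesis word
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.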
Okay, I'll move to draft until then.
This pull request introduces 1 alert when merging 995f916adeb1f5d18afd997a2b45707705cfa320 into ce16320c8c3e39de6c1d7da5def21a5455d8bd13 - view on LGTM.com
new alerts:
- 1 for Unused import
This pull request introduces 1 alert and fixes 1 when merging 98df79a113d0ba5f63f021a19d734835f6f33c02 into df335fe0ec110b0846521580734da447da06a24e - view on LGTM.com
new alerts:
- 1 for Unused import
fixed alerts:
- 1 for Unused import
This pull request introduces 1 alert and fixes 1 when merging 6b6107caec140c79fc9a3931ec20825201c0a9aa into 987674e29ea90f9a2f663bf95d74bd947d76bbc0 - view on LGTM.com
new alerts:
- 1 for Unused import
fixed alerts:
- 1 for Unused import
This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.