NeMo
move transformer from nlp to common
Since transformers are also used in some ASR-related projects, this PR moves the transformer package from the nlp collection to the common collection. Some files are left in the nlp collection to avoid breaking old code.
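The files left behind in the nlp collection would presumably just re-export the relocated modules. A minimal sketch of such a compatibility shim, assuming a hypothetical destination path `nemo.collections.common.parts.transformer` (not necessarily the package used in this PR):

```python
# nemo/collections/nlp/modules/common/transformer/__init__.py (sketch)
# Keep the old import path working by re-exporting from the new location.
# The destination path below is an assumption for illustration only.
import warnings

from nemo.collections.common.parts.transformer import *  # noqa: F401,F403

warnings.warn(
    "nemo.collections.nlp.modules.common.transformer has moved to the common "
    "collection; please update your imports.",
    DeprecationWarning,
    stacklevel=2,
)
```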
This pull request introduces 1 alert and fixes 6 when merging 0f1f76f6e5a97a786372ded63461b133a3cd9e9f into 8845addc6562f2df3740c95ee496a26849b526d0 - view on LGTM.com
new alerts:
- 1 for Unused import
fixed alerts:
- 6 for Unused import
This pull request introduces 1 alert and fixes 6 when merging dca10796bd8d15914a036041d4d3ebd6c53145aa into 3ba267290927b1a2e78d82c7afbc538874c01bcc - view on LGTM.com
new alerts:
- 1 for Unused import
fixed alerts:
- 6 for Unused import
@stevehuang52, sorry for the delay in reviewing the PR. There are currently a lot of open design questions around transformers that we are discussing.
Have you tried using the NeMo Megatron transformer here: https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/nlp/modules/common
There has been a big effort to move Megatron transformers into NeMo, and NeMo NMT has also been moved to these transformers. Is it possible for you to use them?
@ericharper Do you mean the Megatron transformers here: https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/nlp/modules/common/megatron? Any correctly implemented transformer should work, but I also need the sequence generator https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/nlp/modules/common/transformer/transformer_generators.py. Do you know if there is anything equivalent for the Megatron transformers?
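For context, what the sequence generators provide is essentially an autoregressive decoding loop around the decoder. A minimal greedy-decoding sketch of that pattern in plain PyTorch (not the actual NeMo classes or signatures; `embed`, `decoder`, and `classifier` are placeholder callables):

```python
import torch


@torch.no_grad()
def greedy_generate(embed, decoder, classifier, encoder_states, bos_id, eos_id, max_len=64):
    """Minimal greedy autoregressive loop, the pattern the sequence generators implement.

    embed:          maps token ids -> embeddings
    decoder:        transformer decoder taking (target_embeddings, encoder_states)
    classifier:     projects decoder states to vocabulary logits
    encoder_states: encoder output to attend over (e.g. ASR encoder frames)
    """
    batch = encoder_states.size(0)
    tokens = torch.full((batch, 1), bos_id, dtype=torch.long, device=encoder_states.device)
    for _ in range(max_len):
        dec_out = decoder(embed(tokens), encoder_states)   # (B, T, H)
        logits = classifier(dec_out[:, -1])                # last step only
        next_tok = logits.argmax(dim=-1, keepdim=True)     # (B, 1)
        tokens = torch.cat([tokens, next_tok], dim=1)
        if (next_tok == eos_id).all():
            break
    return tokens
```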
@stevehuang52 we'd like to "deprecate" non-Megatron transformers in NeMo. Can you please have a look at whether you can use those?
Megatron transformers require Apex, which I'd like to avoid as much as possible for ASR. @stevehuang52 please check whether the ordinary transformer blocks (which don't depend on Apex) work for your use case.
And why should we deprecate ordinary attention blocks? They are used in more domains than just NLP, and not all of those domains require the Apex library.
@okuchaiev Do the Megatron transformers have a sequence generator similar to the non-Megatron transformers?
I think I'm using the ordinary transformers now. I could also use the PyTorch transformers if we don't want to move the NeMo transformers, but I would need to figure out a replacement for the sequence generators.
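If the plain-PyTorch route were taken, the decoder stack itself is easy to assemble from `torch.nn` building blocks; a generation loop like the greedy sketch above would still be needed on top. A minimal sketch with illustrative dimensions:

```python
import torch
from torch import nn

# Standard PyTorch transformer decoder as a stand-in for the NeMo
# transformer blocks; all dimensions below are illustrative.
d_model, nhead, num_layers, vocab_size = 512, 8, 6, 1024

embedding = nn.Embedding(vocab_size, d_model)
decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=num_layers)
classifier = nn.Linear(d_model, vocab_size)

# One decoding step: attend over encoder output (memory) with a causal mask.
memory = torch.randn(2, 100, d_model)                       # e.g. ASR encoder frames
tgt = embedding(torch.randint(0, vocab_size, (2, 10)))      # current target prefix
causal_mask = nn.Transformer.generate_square_subsequent_mask(10)
out = decoder(tgt, memory, tgt_mask=causal_mask)
logits = classifier(out)                                    # (batch, tgt_len, vocab)
```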
This PR is stale because it has been open for 30 days with no activity. Remove the stale label, comment, or update the PR, or it will be closed in 7 days.