NeMo
move transformer from nlp to common
Since transformers are also used in some ASR-related projects, this PR moves the transformer package from the nlp collection to the common collection. Some files are left in the nlp collection to avoid breaking old code.
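The files left behind in the nlp collection would presumably just re-export the relocated modules. A minimal sketch of such a compatibility shim, assuming a hypothetical destination path `nemo.collections.common.parts.transformer` (not necessarily the package used in this PR):

```python
# nemo/collections/nlp/modules/common/transformer/__init__.py (sketch)
# Keep the old import path working by re-exporting from the new location.
# The destination path below is an assumption for illustration only.
import warnings

from nemo.collections.common.parts.transformer import *  # noqa: F401,F403

warnings.warn(
    "nemo.collections.nlp.modules.common.transformer has moved to the common "
    "collection; please update your imports.",
    DeprecationWarning,
    stacklevel=2,
)
```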
This pull request introduces 1 alert and fixes 6 when merging 0f1f76f6e5a97a786372ded63461b133a3cd9e9f into 8845addc6562f2df3740c95ee496a26849b526d0 - view on LGTM.com
new alerts:
- 1 for Unused import
fixed alerts:
- 6 for Unused import
This pull request introduces 1 alert and fixes 6 when merging dca10796bd8d15914a036041d4d3ebd6c53145aa into 3ba267290927b1a2e78d82c7afbc538874c01bcc - view on LGTM.com
new alerts:
- 1 for Unused import
fixed alerts:
- 6 for Unused import
@stevehuang52, sorry for the delay in reviewing the PR. There are currently a lot of open design questions around transformers that we are discussing.
Have you tried using the NeMo Megatron transformer here: https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/nlp/modules/common
There has been a big effort to move Megatron transformers into NeMo, and NeMo NMT has also been moved to these transformers. Is it possible for you to use them?
@ericharper Do you mean the Megatron transformers here: https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/nlp/modules/common/megatron? Any correctly implemented transformer should work, but I also need the sequence generator https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/nlp/modules/common/transformer/transformer_generators.py. Do you know if there is anything equivalent for the Megatron transformers?
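For context, what the sequence generators provide is essentially an autoregressive decoding loop around the decoder. A minimal greedy-decoding sketch of that pattern in plain PyTorch (not the actual NeMo classes or signatures; `embed`, `decoder`, and `classifier` are placeholder callables):

```python
import torch


@torch.no_grad()
def greedy_generate(embed, decoder, classifier, encoder_states, bos_id, eos_id, max_len=64):
    """Minimal greedy autoregressive loop, the pattern the sequence generators implement.

    embed:          maps token ids -> embeddings
    decoder:        transformer decoder taking (target_embeddings, encoder_states)
    classifier:     projects decoder states to vocabulary logits
    encoder_states: encoder output to attend over (e.g. ASR encoder frames)
    """
    batch = encoder_states.size(0)
    tokens = torch.full((batch, 1), bos_id, dtype=torch.long, device=encoder_states.device)
    for _ in range(max_len):
        dec_out = decoder(embed(tokens), encoder_states)   # (B, T, H)
        logits = classifier(dec_out[:, -1])                # last step only
        next_tok = logits.argmax(dim=-1, keepdim=True)     # (B, 1)
        tokens = torch.cat([tokens, next_tok], dim=1)
        if (next_tok == eos_id).all():
            break
    return tokens
```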
@stevehuang52 we'd like to "deprecate" non-Megatron transformers in NeMo. Can you please have a look at whether you can use those?
Megatron transformers require Apex, which I'd like to avoid as much as possible for ASR. @stevehuang52 please check whether the ordinary transformer blocks (which don't depend on Apex) work for your use case.
And why should we deprecate ordinary attention blocks? They are used in more domains than just NLP, and not all of those domains require the Apex library.
@okuchaiev Do the Megatron transformers have a sequence generator similar to the non-Megatron transformers?
I think I'm using the ordinary transformers now. I could also use the PyTorch transformers if we don't want to move the NeMo transformers, but I would need to figure out a replacement for the sequence generators.
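If the plain-PyTorch route were taken, the decoder stack itself is easy to assemble from `torch.nn` building blocks; a generation loop like the greedy sketch above would still be needed on top. A minimal sketch with illustrative dimensions:

```python
import torch
from torch import nn

# Standard PyTorch transformer decoder as a stand-in for the NeMo
# transformer blocks; all dimensions below are illustrative.
d_model, nhead, num_layers, vocab_size = 512, 8, 6, 1024

embedding = nn.Embedding(vocab_size, d_model)
decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=num_layers)
classifier = nn.Linear(d_model, vocab_size)

# One decoding step: attend over encoder output (memory) with a causal mask.
memory = torch.randn(2, 100, d_model)                       # e.g. ASR encoder frames
tgt = embedding(torch.randint(0, vocab_size, (2, 10)))      # current target prefix
causal_mask = nn.Transformer.generate_square_subsequent_mask(10)
out = decoder(tgt, memory, tgt_mask=causal_mask)
logits = classifier(out)                                    # (batch, tgt_len, vocab)
```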
This PR is stale because it has been open for 30 days with no activity. Remove the stale label, comment, or update the PR, or it will be closed in 7 days.