He Huang (Steve)
> Hey, so with this PR I don't need to define the max_step param for any scheduler?

Not really, this PR aims to fix the bug that the current code...
The problem was already fixed in [this PR](https://github.com/NVIDIA/NeMo/pull/4470), closing this PR
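For context, the max_step-style scheduler param in question typically sits in the optimizer/scheduler section of the model config. A minimal sketch of that layout (the exact keys and values below are illustrative assumptions, not taken from this PR or from the linked one):

```python
# Minimal sketch of a NeMo-style optimizer/scheduler config (illustrative values only;
# whether max_steps can be left out here is what the question above is asking about).
from omegaconf import OmegaConf

optim_cfg = OmegaConf.create({
    "name": "adamw",
    "lr": 1e-3,
    "sched": {
        "name": "CosineAnnealing",   # assumed scheduler choice, for illustration
        "warmup_steps": 1000,
        "max_steps": 100000,         # the param the question above refers to
    },
})
print(OmegaConf.to_yaml(optim_cfg))
```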
> @stevehuang52, sorry for the delay in reviewing the PR. There are currently a lot of design questions around transformers that we are discussing.
>
> Have you tried using the...
> @stevehuang52 we'd like to "deprecate" non-Megatron transformers in NeMo. Can you please have a look at whether you can use those?

@okuchaiev Do Megatron transformers have a sequence generator similar...
> Megatron transformers require apex, and I'd like to avoid that as much as possible for ASR. @stevehuang52 please try to see if the ordinary transformer blocks will work for your...
@titu1994 @nithinraok could you please take another look to see if your comments have been addressed? Thanks~
@titu1994 @zhehuaichen I've refactored the dataset so that the input and output keys can be configured dynamically by setting `context_key` and `answer_key`. For example, if we want...
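Roughly, the intent is that the dataset reads a manifest and maps whichever fields `context_key` and `answer_key` point to onto the model's input and target. A minimal sketch of that idea (the helper function, manifest layout, and field names below are made-up illustrations, not the actual dataset class in the PR):

```python
# Minimal, self-contained sketch of configurable input/output keys
# (hypothetical helper; the real dataset class is more involved).
import json

def load_samples(manifest_path, context_key="context", answer_key="answer"):
    """Read a JSONL manifest and pick input/target fields via configurable keys."""
    samples = []
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            samples.append({
                "audio_filepath": entry["audio_filepath"],  # assumed audio field
                "context": entry[context_key],  # e.g. an instruction/question field
                "answer": entry[answer_key],    # e.g. a transcript/response field
            })
    return samples

# e.g. load_samples("train.json", context_key="question", answer_key="text")
# would train on a QA-style manifest without changing the dataset code.
```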
@zhehuaichen FYI I removed the `random context training` trick from the dataset, since it only makes sense for word-boosting and not other tasks. It's better to actually generate those word-boosting...
> > @zhehuaichen FYI I removed the `random context training` trick from the dataset, since it only makes sense for word-boosting and not other tasks. It's better to actually generate...
jenkins