Abhishree Thittenamane comments

Results: 67 comments of Abhishree Thittenamane

Add support for distributed optimizer

> @athitten there were still some whitespace errors that I have fixed. Some mypy checks were failing that I needed to familiarize myself with and then fix. Otherwise, just waiting...

Add support for distributed optimizer

> @jeffdaily @athitten just came across this...looks like we forgot to merge this. Will this PR cause our fork to diverge from upstream in this regard, and is that what...

Implement TensorBase.is_cuda

@k223kim I didn't know support for `is_cuda` already existed. The version of thunder I was using probably didn't have it yet.

Adding the updated command to use `megatron_amp_O2=True` and `model.mcore_gpt = True` (NeMo models will be defaulting to using models from Megatron, hence this setting). With `megatron_amp_O2=True`, having `precision=bf16` should do...

Support NeMo NeVA Model

This might be helpful: the full config with default values for all parameters can be found [here](https://github.com/NVIDIA/NeMo/blob/main/examples/multimodal/multimodal_llm/neva/conf/neva_config.yaml). Only the parameters we specify in the run command get overwritten by the...

Support NeMo NeVA Model

Yes, it's important to prioritize getting thunder working with `mcore_gpt=True`, as it will be the default for NeMo models once we deprecate the legacy path.

Add REST API to deploy module

> Were you able to run this code successfully?

Testing that currently. The REST API endpoints are visible and the `get` method works successfully. However, the `post` method with /v1/completions/ end...

Add REST API to deploy module

> > Were you able to run this code successfully?
>
> Testing that currently. The REST API endpoints are visible and the `get` method works successfully. However, the `post` method...

TypeError: Missing a required argument with thunder.jit in NeMo SD ResBlock

The same error occurs when adding `thunder.jit` to the subsequent [ResBlock here](https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py#L825)

NVFuser error adding thunder.jit to UNet model of NeMo Stable Diffusion

@xwang233 to build NeMo on top of pjnl, I used the pjnl container (gitlab-master.nvidia.com:5005/dl/pytorch/update-scripts:pjnl-latest) from last Friday. Is the error fixed in the newest version of the pjnl container? I can...