Tom Fogal comments

Results: 82 comments of Tom Fogal

Implement TensorBase.div_

98% sure Masaki's recent PR implemented this. Assigning to Masaki to either close or comment on status.

Enable cuDNN executor by default

> cudnn executor is overly optimistic when claiming sdpa.

ahh; yes, we should fix that first. Thanks for remembering this!

> sdpa checker function can throw errors with a pseudo...

Support NeMo MegatronImagen network

Quick update: I talked to the NeMo team and Eric had the (reasonable) concern that swapping in-place for out-of-place might increase memory consumption. The onus is on me to give...

Implement torch.device

would like to discuss at triage review: can we just say `cuda` for everything? what about tensors from the outside (i.e. input tensors) that are on `cpu` or even things...

Implement _set_grad_enabled of torch._C

triage team: looking to understand if this is high-effort or low-effort (honestly, looking for that on all NeMo things, this one's just particularly out of my depth).

Implement _set_grad_enabled of torch._C

> model may need to be revised to target thunder

I'm not sure "revise the model" is going to be a reasonable solution in the general case; already seen this...

Implement _VariableFunctionsClass.randint of torch

triage team: what guarantees do we need to provide w.r.t. the generated random numbers? i.e. do we need to match torch exactly, match the same distribution, merely respect the `low`/`high`...
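For the sake of discussion, the three levels of guarantee mentioned above could be phrased as separate checks. This is only a sketch in plain Python: `randint_ref`, `respects_bounds`, and `roughly_uniform` are hypothetical names, the standard-library `random` module stands in for whatever generator would actually be used, and the half-open range `[low, high)` follows `torch.randint`'s documented convention.

```python
import random
from collections import Counter


def randint_ref(low, high, size, *, seed=None):
    """Hypothetical stand-in: `size` integers drawn uniformly from [low, high)."""
    rng = random.Random(seed)
    return [rng.randrange(low, high) for _ in range(size)]


def respects_bounds(samples, low, high):
    """Weakest guarantee: every sample lies in the half-open range [low, high)."""
    return all(low <= s < high for s in samples)


def roughly_uniform(samples, low, high, tol=0.1):
    """Middle guarantee: each value's count is within `tol` (relative) of the uniform expectation."""
    expected = len(samples) / (high - low)
    counts = Counter(samples)
    return all(abs(counts[v] - expected) <= tol * expected for v in range(low, high))


samples = randint_ref(0, 4, 20_000, seed=0)
bounds_ok = respects_bounds(samples, 0, 4)
uniform_ok = roughly_uniform(samples, 0, 4)
# Strongest guarantee: the same seed reproduces the same stream, element for element.
exact_ok = samples == randint_ref(0, 4, 20_000, seed=0)
```

Note that the exact-match check here only compares the stand-in against itself; matching torch element for element would additionally require reproducing torch's own generator, which this sketch does not attempt.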

Implement TensorBase.is_cuda

Closing based on discussion above.

Implement _VariableFunctionsClass.baddbmm of torch

FWIW it looks like [NeMo always sets `beta` to `0.0`](https://github.com/NVIDIA/NeMo/blob/8e65042d15062ce3fbe639f9d428c639510d894c/nemo/collections/nlp/modules/common/megatron/attention.py#L947).
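To illustrate why that matters: `torch.baddbmm` computes `beta * input + alpha * (batch1 @ batch2)`, and its documentation states that when `beta` is 0 the `input` is ignored, so `nan` and `inf` in it are not propagated. Below is a minimal plain-Python reference model of that behaviour on nested lists; `bmm` and `baddbmm_ref` are hypothetical helper names, not part of torch or thunder, and broadcasting of `input` is deliberately left out.

```python
def bmm(batch1, batch2):
    """Batched matrix multiply on nested lists: (b, n, m) x (b, m, p) -> (b, n, p)."""
    return [
        [[sum(a * b for a, b in zip(row, col)) for col in zip(*m2)] for row in m1]
        for m1, m2 in zip(batch1, batch2)
    ]


def baddbmm_ref(inp, batch1, batch2, *, beta=1.0, alpha=1.0):
    """Reference model of out = beta * inp + alpha * (batch1 @ batch2), without broadcasting."""
    prod = bmm(batch1, batch2)
    if beta == 0:
        # With beta == 0 the input is skipped entirely, so NaN/inf in it cannot leak into the output.
        return [[[alpha * x for x in row] for row in mat] for mat in prod]
    return [
        [[beta * i + alpha * x for i, x in zip(irow, prow)] for irow, prow in zip(imat, pmat)]
        for imat, pmat in zip(inp, prod)
    ]


nan = float("nan")
inp = [[[nan, nan], [nan, nan]]]       # deliberately poisoned input
b1 = [[[1.0, 2.0], [3.0, 4.0]]]
b2 = [[[5.0, 6.0], [7.0, 8.0]]]
out = baddbmm_ref(inp, b1, b2, beta=0.0)  # equals alpha * bmm(b1, b2)
```

If `beta` really is always `0.0` on the NeMo side, the operation reduces to a scaled batched matmul, which may make a first implementation simpler; that is an inference from the linked line, not something verified against every call site.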

Support NeMo NeVA Model

> Can you share the script for the `examine` call?

@athitten when you have a minute