Joe Cummings
> @msaroufim Is there a method on NF4Tensor that could be easily implemented to enable FSDP2 + QLoRA + Compile? To make this even more explicit, we have 405B working...
Using torchao-nightly I am now able to run the above command with `compile=True`. However, I get the following output:
```
[rank0]:W0729 13:35:08.486000 917826 torch/_dynamo/convert_frame.py:795] [8/8] torch._dynamo hit config.cache_size_limit (8)
[rank0]:W0729...
```
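When dynamo hits `config.cache_size_limit`, it stops recompiling that frame and falls back to eager. One common workaround (a sketch of a general option, not necessarily the fix adopted here) is to raise the limit before the first `torch.compile` call:

```python
import torch

# Raise the recompile cache limit before compiling anything.
# The default is 8; once a frame exceeds it, dynamo falls back to eager
# for that frame, which silently loses the compile speedup.
torch._dynamo.config.cache_size_limit = 64
```

Note that raising the limit only masks the symptom; the underlying recompilation cause (e.g. guard churn on tensor subclasses like NF4Tensor) still needs to be tracked down.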
> Not a torchtune author/contributor, but from the memory usage, I'm guessing that the old version performs NF4 quantization on GPU, while the new version performs it on CPU. Makes...
Hi @l-berg - thanks for bringing this to our attention! The AO folks dug deep into this and found that a version-guarded `inplace_copy` function was the offending issue. Please...
Closing this issue as it is now possible through the TorchAO library.
> Adding a comment to track this discussion in Pytorch core [pytorch/pytorch#130330](https://github.com/pytorch/pytorch/issues/130330) If this lands, we should enable this by default in torchtune until it lands in a PyTorch stable...
> Just wanted to confirm that running on A100, with the flag I can run bs=4, but without it, it OOMs This would imply that we should be paying attention to...
> Was this the kind of thing you had in mind? [`1129f9e`/torchtune/modules/rlhf/_generation.py](https://github.com/pytorch/torchtune/blob/1129f9e3a246628c991c246d81dbead62d3437a3/torchtune/modules/rlhf/_generation.py) Yep, this is pretty much it! I take it that you're not utilizing the KV Cache for this...
Left padded: ``` My, name, is, Joe ,  , Hello, world ,  ,  , Bye ``` Left padded mask: ``` 1 0 0 0 1 1 0 0 1 1 1...
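The left-padding layout above can be sketched in plain Python (hypothetical helper names for illustration, not torchtune's API; the real logic lives in `torchtune/modules/rlhf/_generation.py`):

```python
# Minimal sketch of left padding for batched generation: shorter sequences
# get pad tokens prepended, and the mask marks real tokens with 1.
PAD = 0

def left_pad(seqs, pad=PAD):
    """Pad each sequence on the left so all rows share the max length."""
    width = max(len(s) for s in seqs)
    return [[pad] * (width - len(s)) + list(s) for s in seqs]

def padding_mask(padded, pad=PAD):
    """1 for real tokens, 0 for left-padding."""
    return [[0 if tok == pad else 1 for tok in row] for row in padded]

batch = [[11, 12, 13, 14], [21, 22], [31]]
padded = left_pad(batch)
mask = padding_mask(padded)
# padded -> [[11, 12, 13, 14], [0, 0, 21, 22], [0, 0, 0, 31]]
# mask   -> [[1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
```

Left padding keeps the last real token of every row in the final column, which is what incremental decoding needs when appending newly generated tokens.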
We started with those first two recipes in order to prove out the concept, but there's no reason why we cannot add it to the single device full finetuning. We'd...