Kartikay Khandelwal
#### Context
As per title.

#### Changelog
- Builder function + config

#### Test plan
- Trained for one epoch with the following loss
- Training Speed ...
## Context
On a single device, our current Llama 7B full fine-tune recipe either OOMs with the `AdamW` optimizer, or takes > 55GB with `SGD`. Given the importance of single device...
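For rough intuition on why `AdamW` OOMs where `SGD` fits (barely), a back-of-envelope memory estimate helps. This sketch assumes fp32 weights, gradients, and optimizer state, ignores activations, and treats `SGD` as momentum-free (these assumptions are mine, not stated in the issue):

```python
# Rough single-device memory estimate for a 7B-parameter full fine-tune.
# Assumption: fp32 everywhere; AdamW keeps two fp32 moment buffers per
# parameter, plain SGD (no momentum) keeps none. Activations excluded.
NUM_PARAMS = 7e9
BYTES_FP32 = 4

def estimate_gib(optimizer_state_copies: int) -> float:
    """Weights + gradients + optimizer state, in GiB."""
    copies = 2 + optimizer_state_copies  # weights + grads + optimizer state
    return NUM_PARAMS * BYTES_FP32 * copies / 2**30

print(f"SGD (no momentum): ~{estimate_gib(0):.0f} GiB")
print(f"AdamW (2 moments): ~{estimate_gib(2):.0f} GiB")
```

Under these assumptions SGD lands in the low 50s of GiB, consistent with the "> 55GB" observation once activations are added, while AdamW's two extra fp32 buffers roughly double that.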
Creating a single tracker for potential feature requests that are currently not on the roadmap. This will help with tracking and prioritization, and remove issues with no context and no...
@iseeyuan had a great suggestion that we should add information about expected run time to our different tutorials and docs. Otherwise it's unclear to a first-time user what...
On 6 GPUs this is taking ~30GB/device, which doesn't seem right. This needs some debugging.
Make sure the RoPE embeddings and norms are being correctly computed when training with full bf16.
We make heavy use of builder functions for instantiating specific model architectures from generalized building blocks. For example, the [llama2 builder function](https://github.com/pytorch-labs/torchtune/blob/main/torchtune/models/llama2.py#L77) is used to stitch together the components needed...
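The builder-function pattern can be sketched as follows. All names here (`TransformerBlock`, `Decoder`, `tiny_llama`, and the default hyperparameters) are illustrative stand-ins, not torchtune's actual API:

```python
# Sketch of the builder-function pattern: a plain function that stitches
# generic building blocks into one named model configuration.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TransformerBlock:
    embed_dim: int
    num_heads: int

@dataclass
class Decoder:
    vocab_size: int
    layers: List[TransformerBlock] = field(default_factory=list)

def tiny_llama(vocab_size: int = 32_000,
               num_layers: int = 4,
               embed_dim: int = 256,
               num_heads: int = 8) -> Decoder:
    """Builder: one architecture = one function with its hyperparameters baked in."""
    return Decoder(
        vocab_size=vocab_size,
        layers=[TransformerBlock(embed_dim, num_heads) for _ in range(num_layers)],
    )

model = tiny_llama()
print(len(model.layers), model.layers[0].embed_dim)  # → 4 256
```

The appeal is that adding a new architecture means adding a new builder, not new component classes; the building blocks stay generic.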
Currently it's not clear to contributors that they need to run the linter before submitting PRs. We need to make this more prominent and easier.
Using this issue to track the work we need to do to enable this. We'll capture our learnings, TODOs, and PRs over here so everyone can follow along.
Important note: the tokenizer still needs some work; this will be a follow-up PR.

#### Context
What is the purpose of this PR? Is it to
- [x] add...