ebsmothers
Hi, thanks for your CoCa implementation! I have a question on the multimodal transformer: typically in a decoder layer I would expect to see self-attention, then cross-attention, then an MLP....
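For concreteness, here's a minimal sketch of the layer ordering I mean (self-attention, then cross-attention, then MLP, each with a residual connection). The pre-norm placement, dimensions, and `nn.MultiheadAttention` usage are just illustrative assumptions on my part, not necessarily how CoCa structures its multimodal decoder:

```python
import torch
import torch.nn as nn


class DecoderLayerSketch(nn.Module):
    """Illustrative decoder layer: self-attention -> cross-attention -> MLP."""

    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2, self.norm3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, image_tokens: torch.Tensor) -> torch.Tensor:
        # 1) self-attention over the text tokens (causal mask omitted for brevity)
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h, need_weights=False)[0]
        # 2) cross-attention: text queries attend to the image tokens
        h = self.norm2(x)
        x = x + self.cross_attn(h, image_tokens, image_tokens, need_weights=False)[0]
        # 3) position-wise MLP
        return x + self.mlp(self.norm3(x))
```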
Creating this issue to track gaps in our current testing. ### Tests to write - [ ] Add gradient accumulation test for LoRA recipe (ideally also testing LR scheduler) -...
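For the gradient accumulation item above, here's a rough sketch of the equivalence check such a test could perform; it uses a toy `nn.Linear` stand-in rather than the actual LoRA recipe, so the recipe wiring is left to the real test:

```python
import torch
import torch.nn as nn


def test_grad_accumulation_matches_full_batch():
    # Two identical copies of a toy model (the real test would build the LoRA model).
    torch.manual_seed(0)
    model_full = nn.Linear(8, 2)
    model_accum = nn.Linear(8, 2)
    model_accum.load_state_dict(model_full.state_dict())

    data, labels = torch.randn(4, 8), torch.randn(4, 2)
    loss_fn = nn.MSELoss()

    # Full-batch backward pass.
    loss_fn(model_full(data), labels).backward()

    # Accumulated backward over two equal micro-batches, scaled to match the mean reduction.
    for chunk_x, chunk_y in zip(data.chunk(2), labels.chunk(2)):
        (loss_fn(model_accum(chunk_x), chunk_y) / 2).backward()

    # Gradients should match (up to numerical tolerance) before the optimizer step.
    for p_full, p_accum in zip(model_full.parameters(), model_accum.parameters()):
        torch.testing.assert_close(p_full.grad, p_accum.grad)
```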
Main changes: log on every step, accumulate metrics correctly over iterations, scrap `log_memory_stats_every_n_steps` and consolidate it with the existing `log_every_n_steps`. Still need to test that I didn't break anything. If we like this...
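To make the intended behavior concrete, here's a sketch of the accumulate-then-log pattern; the variable names and the `logger.log_dict` call are assumptions for illustration, not the recipe's actual code:

```python
def train_loop(model, dataloader, loss_fn, optimizer, logger,
               gradient_accumulation_steps: int, log_every_n_steps: int) -> None:
    """Accumulate the loss over gradient-accumulation iterations, then log the
    accumulated value once per optimizer step (gated by log_every_n_steps)."""
    running_loss, global_step = 0.0, 0
    for idx, batch in enumerate(dataloader):
        logits = model(batch["tokens"])
        loss = loss_fn(logits, batch["labels"]) / gradient_accumulation_steps
        loss.backward()
        running_loss += loss.item()
        if (idx + 1) % gradient_accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            global_step += 1
            if global_step % log_every_n_steps == 0:
                logger.log_dict({"loss": running_loss}, step=global_step)
            running_loss = 0.0
```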
Updating our default PR template to hopefully make it easier and clearer which testing/sanity checks to run when opening a PR. New template pasted below:
There are some gotchas around using the base llama3 fine-tuned models with respect to special tokens. While we should smooth these out and make it easy to use for...
Based on https://github.com/meta-llama/llama3/blob/main/llama/tokenizer.py#L91-L94 and https://github.com/meta-llama/llama3/blob/main/llama/generation.py#L197, we should support stopping on more than one token during generation. This PR adds this field to our tokenizers and integrates it into the generation...
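For context, here's a minimal sketch of what stopping on any of several tokens looks like in a greedy decoding loop; the `model` interface (token tensor in, logits out) and the stop-token ids are placeholders, not torchtune's actual generation utilities:

```python
from typing import Set

import torch


def generate(model, prompt: torch.Tensor, stop_tokens: Set[int], max_new_tokens: int = 256) -> torch.Tensor:
    """Greedy decoding that halts as soon as *any* stop token is produced."""
    tokens = prompt.clone()  # shape [1, seq_len]
    for _ in range(max_new_tokens):
        logits = model(tokens)  # assumed shape [1, seq_len, vocab_size]
        next_token = logits[:, -1].argmax(dim=-1, keepdim=True)  # [1, 1]
        tokens = torch.cat([tokens, next_token], dim=-1)
        if next_token.item() in stop_tokens:  # e.g. the ids for <|eot_id|> and <|end_of_text|>
            break
    return tokens
```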
Right now we have e.g. `tests/torchtune/models/test_llama3.py` and `tests/torchtune/models/test_lora_llama2.py` test files for our models. This is not in line with our implementation code, which is organized into directories like `torchtune/models/llama2`, `torchtune/models/mistral`, etc. We should...
This is a PR for integration with PEFT to allow continued fine-tuning of checkpoints from torchtune. We save a file `adapter_config.json`, along with `adapter_model.bin`, to match the format expected by...
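For reference, this is roughly the shape of `adapter_config.json` that PEFT's `LoraConfig` serializes; the exact set of keys varies by PEFT version, and the values below are placeholders:

```python
import json

adapter_config = {
    "peft_type": "LORA",
    "base_model_name_or_path": "meta-llama/Meta-Llama-3-8B",  # placeholder
    "task_type": "CAUSAL_LM",
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": ["q_proj", "v_proj"],
}

with open("adapter_config.json", "w") as f:
    json.dump(adapter_config, f, indent=2)
```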
### Describe the bug Hi there, thanks for the great library! We have been using it a lot in torchtune and it's been a huge help for us. Regarding the...
### Context Based on #1001, it's clear that our generation recipe is not the most flexible when it comes to the different tokenizers/formats/templates we support. Right now this stuff is inherently...
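One possible direction (names here are hypothetical, just to illustrate the idea): have the recipe depend only on a small prompt-template interface, so swapping tokenizers/formats doesn't require touching the generation loop itself:

```python
from typing import Optional, Protocol


class PromptTemplate(Protocol):
    """Anything that turns a raw user prompt into the string the model should see."""

    def format(self, prompt: str) -> str: ...


class Llama3ChatTemplate:
    # Illustrative only; in practice the special-token handling lives in the tokenizer.
    def format(self, prompt: str) -> str:
        return (
            "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
            f"{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        )


def build_model_input(prompt: str, template: Optional[PromptTemplate]) -> str:
    # The recipe only sees the PromptTemplate interface, so adding a new
    # format/template doesn't require changes to the generation loop.
    return template.format(prompt) if template is not None else prompt
```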