Philip Bontrager
Currently our batch size is a local batch size. This means that with bs=4, if you launch on 4 GPUs, each GPU gets 4 data points and your real...
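A minimal sketch of the distinction, assuming a standard `torch.distributed` data-parallel launch (the helper name `get_global_batch_size` is illustrative, not an existing API):

```python
import torch.distributed as dist

def get_global_batch_size(local_batch_size: int) -> int:
    """Illustrative helper: the effective batch size per optimizer step
    is the per-GPU (local) batch size times the number of
    data-parallel workers."""
    world_size = dist.get_world_size() if dist.is_initialized() else 1
    return local_batch_size * world_size

# e.g. bs=4 launched on 4 GPUs: each GPU sees 4 samples, but each
# gradient step averages over 4 * 4 = 16 samples globally.
```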
#### Context
What is the purpose of this PR? Is it to
- [x] add a new feature
- [ ] fix a bug
- [ ] update tests and/or...
# [RFC] Fusion Models
**TLDR**
- Fused models are two or more pre-trained models joined together and further tuned to work as one model. This is the approach used for most SOTA...
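A rough sketch of the idea under assumed names (the `FusedModel` class and the decoder's `encoder_input` keyword are hypothetical, used only to illustrate the pattern of joining two pre-trained models with a small set of new trainable parameters):

```python
import torch
from torch import nn

class FusedModel(nn.Module):
    """Hypothetical sketch: join a pre-trained encoder (e.g. a vision
    model) with a pre-trained decoder (e.g. an LLM) and tune them to
    work as one model."""

    def __init__(self, encoder: nn.Module, decoder: nn.Module,
                 encoder_dim: int, decoder_dim: int):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        # New parameters learned during fusion tuning; the two
        # pre-trained models can remain partly or fully frozen.
        self.projection = nn.Linear(encoder_dim, decoder_dim)

    def forward(self, image: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        encoder_out = self.projection(self.encoder(image))
        # Assumed interface: the decoder attends to the projected
        # encoder states, e.g. via interleaved cross-attention layers.
        return self.decoder(tokens, encoder_input=encoder_out)
```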
#### Context What is the purpose of this PR? Is it to - [x] add a new feature - [ ] fix a bug - [ ] update tests and/or...
# [RFC] TransformerDecoderLayer Refactor
Refactor TransformerDecoder so it can be used for multimodal architectures.
**TLDR**
- Replace TransformerDecoderLayer with TransformerSelfAttention and TransformerCrossAttention
- Replace CausalSelfAttention with GroupedQueryAttention
- Support legacy...
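A sketch of the split named in the TLDR, with assumed constructor and forward signatures: the point is that the two layer types differ only in where keys and values come from, so one attention module can serve both.

```python
import torch
from torch import nn

class TransformerSelfAttentionLayer(nn.Module):
    """Sketch: queries, keys, and values all come from the decoder
    hidden states."""
    def __init__(self, attn: nn.Module, mlp: nn.Module):
        super().__init__()
        self.attn, self.mlp = attn, mlp

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(x, x)  # keys/values from x itself
        return x + self.mlp(x)

class TransformerCrossAttentionLayer(nn.Module):
    """Sketch: keys and values come from a separate encoder sequence,
    enabling multimodal architectures."""
    def __init__(self, attn: nn.Module, mlp: nn.Module):
        super().__init__()
        self.attn, self.mlp = attn, mlp

    def forward(self, x: torch.Tensor, encoder_out: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(x, encoder_out)  # keys/values from the encoder
        return x + self.mlp(x)
```

A grouped-query attention module (fewer key/value heads than query heads, generalizing both multi-head and multi-query attention) would plug in as `attn` in either layer.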
In the HF Checkpointer, we warn the user that the adapter weights can't be converted to the PEFT format and will be converted to a torchtune format, but then we...
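A minimal sketch of the flow being described; the function name and paths here are hypothetical, not the torchtune checkpointer API:

```python
import logging
import torch

logger = logging.getLogger(__name__)

def save_adapter_weights(state_dict: dict, output_dir: str) -> None:
    # Hypothetical sketch of the described behavior: warn that the
    # adapter weights won't be in PEFT format, then save them in the
    # torchtune format instead.
    logger.warning(
        "Adapter weights cannot be converted to the PEFT format; "
        "saving them in the torchtune format instead."
    )
    torch.save(state_dict, f"{output_dir}/adapter_model.pt")
```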
Image size must be divisible by the ViT patch size used in the CLIP encoder, which is 14.
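A sketch of the constraint, with illustrative numbers (the patch size of 14 is from the text above; the example image sizes are assumptions):

```python
def validate_image_size(image_size: int, patch_size: int = 14) -> None:
    """The ViT splits each image into patch_size x patch_size patches,
    so the input resolution must be an exact multiple of the patch
    size."""
    if image_size % patch_size != 0:
        raise ValueError(
            f"image_size ({image_size}) must be divisible by "
            f"the CLIP patch size ({patch_size})"
        )

validate_image_size(224)    # ok: 224 = 16 * 14
validate_image_size(336)    # ok: 336 = 24 * 14
# validate_image_size(256)  # would raise: 256 % 14 == 4
```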
With the addition of multimodal and DPO, collation is getting more varied and complicated, depending on the recipe, the model, and the type of data. To...
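One way to read the problem, sketched under assumed names (`text_collate` and the registry are illustrative, not existing torchtune APIs): each data type needs its own collation, so recipes could select a collater rather than hard-coding one.

```python
from typing import Any, Callable, Dict, List
import torch

Sample = Dict[str, Any]

def text_collate(batch: List[Sample], pad_id: int = 0) -> Dict[str, torch.Tensor]:
    """Illustrative: pad token sequences to the longest in the batch."""
    max_len = max(len(s["tokens"]) for s in batch)
    tokens = torch.full((len(batch), max_len), pad_id, dtype=torch.long)
    for i, s in enumerate(batch):
        tokens[i, : len(s["tokens"])] = torch.tensor(s["tokens"])
    return {"tokens": tokens}

# Hypothetical registry: the recipe or model config picks the collater,
# instead of every recipe re-implementing collation inline.
COLLATERS: Dict[str, Callable[..., Dict[str, torch.Tensor]]] = {
    "text": text_collate,
    # "dpo": dpo_collate,        # would pad chosen/rejected pairs
    # "multimodal": mm_collate,  # would batch tokens plus image tiles
}
```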