Liger-Kernel
Efficient Triton Kernels for LLM Training
## Summary Fix #272. This is a showcase of how to raise errors properly. I applied it only to cross_entropy for demonstration; it can be applied to others if we want. ##...
## Summary Monkey patches layer norm in mllama for conditional generation ## Testing Done Tested monkey patching works as intended - Hardware Type: - [ ] run `make test` to...
Fixes #305 Fix dtype mismatch in fused_linear_cross_entropy_forward function. * Cast `logits_chunk` to the data type of `_input_chunk` before performing operations on it. --- I tested this in Colab after the...
## Summary Resolve #277. ## Testing Done - Hardware Type: gpu-ci - [x] run `make test` to ensure correctness - [x] run `make checkstyle` to ensure code style - [x]...
### Describe the bug Tensors saved in `medusa_only_heads` mode are empty. Ref: https://github.com/linkedin/Liger-Kernel/blob/main/examples/medusa/train.py#L392 ### Reproduce _No response_ ### Versions N/A
### Describe the bug #254 [#262 (comments)](https://github.com/linkedin/Liger-Kernel/pull/262#issuecomment-2374260041) PyTorch's autograd system records operations on tensors to construct a computational graph, which is used for computing gradients. When an in-place operation...
### Describe the bug I encountered a RuntimeError while running a full fine-tuning experiment using LLaMA-Factory on a model with BFloat16 precision. The error occurred during the training...
## Summary This PR aims to resolve #197. Implemented z loss in LigerCrossEntropy. Note: `lse_square_scale` is not exposed at flce yet, because of issues passing the tests. ## Details ### For loss:...
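For readers unfamiliar with z loss, here is a minimal pure-Python sketch of the commonly used definition: an auxiliary penalty equal to a scale factor times the squared log-sum-exp of the logits, added to the ordinary cross-entropy loss. It is an illustration under that assumption, not the PR's Triton implementation; the function name `cross_entropy_with_z_loss` is hypothetical, and only the parameter name `lse_square_scale` is taken from the summary above.

```python
import math


def cross_entropy_with_z_loss(logits, target, lse_square_scale=0.0):
    """Return (total_loss, z_loss) for a single example.

    Hypothetical helper for illustration only; it assumes the common
    z-loss form  z_loss = lse_square_scale * logsumexp(logits) ** 2.
    """
    # Numerically stable log-sum-exp: subtract the max before exponentiating.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))

    # Standard cross entropy for the target class: -log softmax(logits)[target].
    ce = lse - logits[target]

    # Auxiliary penalty that discourages the log-partition from drifting.
    z_loss = lse_square_scale * lse ** 2
    return ce + z_loss, z_loss


total, z = cross_entropy_with_z_loss([2.0, 0.5, -1.0], target=0, lse_square_scale=1e-4)
```

With `lse_square_scale=0.0` the function reduces to plain cross entropy, which makes it easy to check the penalty term in isolation.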
### The feature, motivation and pitch I want to use this for PEFT training. ### Alternatives _No response_ ### Additional context _No response_
### The feature, motivation and pitch Currently we only have examples for text-based models here: https://github.com/linkedin/Liger-Kernel/tree/main/examples/huggingface. An example showing how to run mllama vision model end to end...