Sehun Heo issues

Results: 5 issues of Sehun Heo

Bugfix: Fix a bug where the model's device setting does not match the flag when running Prompt Inference

## When the bug occurs When running the existing code in a GPU environment, the following error occurs. ``` RuntimeError: Expected all tensors to be on the same device, but found at least two...

[Badcase]: Loss does not drop when using Liger Kernel at Qwen2.5

### Has this been raised before? - [X] I have checked [the GitHub README](https://github.com/QwenLM/Qwen2.5). - [X] I have checked [the Qwen documentation](https://qwen.readthedocs.io) and cannot find an answer there. - [X]...

How to visualize Attention Visualization

Hello, thank you for sharing your excellent methodology. While reviewing your paper, I came up with a question: how do you implement attention visualization for the retrieved...

Loss does not drop when using Liger Kernel at Qwen2.5

### 🐛 Describe the bug I am trying to instruction-tune Qwen2.5-14B-Instruct with [Liger Kernel](https://github.com/linkedin/Liger-Kernel). I know that Liger Kernel is supported in the dev version of Hugging Face transformers....

[Feature]: supporting MllamaForCausalLM

### 🚀 The feature, motivation and pitch `MllamaForConditionalGeneration` models (such as `meta-llama/Llama-3.2-90B-Vision-Instruct` and `meta-llama/Llama-3.2-11B-Vision`) are composed of `MllamaVisionModel` and `MllamaForCausalLM`. I want to use only `MllamaForCausalLM`, and for this, I can...

feature request