TransnormerLLM

Official implementation of TransNormerLLM: A Faster and Better LLM

Results 6 TransnormerLLM issues

Hi. Thanks for the nice Triton implementation. I may have found a bug in the Triton operator. It seems that the operator does not support a head dim of 192, although it does support a head dim of 128...

I tested transnormerllm-385m with llm-eval-harness on the boolq benchmark. However, the result does not match the result you reported. Besides the boolq benchmark and the 385m model, other benchmarks...

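Hello, I was very excited to see this project! Unfortunately, 150b is just too large. Are there plans to release a version of around 13b (for example, for code and similar tasks)?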

Thanks for your great work. When will the code be released?

Hello, I have two questions I’d like to ask: 1. In this repository, I noticed that the implementations of lightning attention1 and lightning attention2 appear identical. 2. The implementation of...