Amos You
Fixes duplicate formatting for classes inheriting from NamedTuples, as seen in #669
Adds the Eve optimizer in wrapper form in response to #475
Hi, thanks for the amazing work! I was wondering if there are plans to release the model weights after the code release? It wasn't mentioned in the paper so I wanted...
## Motivation
[NVLM-D](https://nvlm-project.github.io/) is a decoder-only VLM recently released by NVIDIA that performs well across different benchmarks. I've been trying to get a better understanding of VLM/LLM architectures, figured...
## Motivation
The Triton kernel for decode attention was updated with a new backend interface in #3292, breaking the benchmark code.
## Modifications
Corrected the import for `should_use_tensor_core` and replaced...