Andre Slavescu

Results 13 issues of Andre Slavescu

*Description of changes:*
- added rich (version==13.3.2) to requirements.txt (experienced a dependency conflict without it)
- added z-loss
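For context on the z-loss item, here is a minimal sketch of the auxiliary z-loss as it is commonly defined: a coefficient times the squared log of the softmax normalizer. The function name `z_loss`, the default `coeff`, and the single-row list input are assumptions for illustration and are not taken from the PR itself.

```python
import math

def z_loss(logits, coeff=1e-4):
    """Auxiliary z-loss for one row of logits: coeff * log(Z)**2,
    where Z = sum(exp(logit)). `coeff` is an illustrative default."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return coeff * log_z ** 2

# Hypothetical usage: the penalty grows as log(Z) drifts away from 0.
penalty = z_loss([2.0, 1.0, 0.5])
```

In training code this term would typically be added to the cross-entropy loss; how the PR wires it in is not shown in the snippet.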

reference to issue https://github.com/vllm-project/vllm/issues/198

Just as a simple suggestion, I'm wondering if there can be a discussion tab to post problems with running the system as opposed to posting individual issues. This can keep...

GELU implementation: processes the input in chunks and uses hardware intrinsics for slightly faster performance.

Results: (original implementation) GPU cuda_gelu execution times: [13.78, 13.26, 13.95, 13.0, 12.93, 13.46] (second implementation) GPU...
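As a rough CPU-side reference for the chunked approach, the sketch below applies GELU to a list in fixed-size chunks. It assumes the tanh approximation of GELU, which the snippet does not confirm, and the names `gelu_tanh`, `gelu_chunked`, and `chunk_size` are hypothetical. It does not model the CUDA kernel or its hardware intrinsics.

```python
import math

def gelu_tanh(x):
    """Tanh approximation of GELU (assumed variant, not confirmed by the issue)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

def gelu_chunked(values, chunk_size=4):
    """Apply GELU chunk by chunk; the result matches the elementwise version."""
    out = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]  # last chunk may be shorter
        out.extend(gelu_tanh(v) for v in chunk)
    return out

# Hypothetical usage on a small input.
result = gelu_chunked([-1.0, 0.0, 1.0, 2.0, 3.0], chunk_size=2)
```

A reference like this is mainly useful for checking that a faster kernel still produces the same values within tolerance.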

## Goal:
- decide which optimizers are worth implementing in a priority order

## Remaining Optimizers + General Usage
from my understanding there are really only 2 from the ones...

enhancement

Would it be an idea to define the same kernels that exist in the CUDA backend with ThunderKittens as well? They have cool examples with FlashAttention2 and I think it...

- [x] Restructure Makefile (automate detection of compute capability)
- [ ] Optimize existing kernels

ref: https://github.com/castorini/ura-projects/issues/39