Varun Sundar Rabindranath

9 comments by Varun Sundar Rabindranath

@bnellnm I added some quant kernel tests. We should definitely add some model tests.

Thanks for the fix @ProExpertProg 🙌

Marking this as a draft. These kernels are not a priority at the moment, given that a masked-fused-act-mul-quant kernel exists in https://github.com/vllm-project/vllm/tree/ll_deepgemm_opt. We can revive this when needed.

@bnellnm @youkaichao @tlrmchlsmth PTAL! Thanks.

LGTM! This is a nice refactor! Thanks @LucasWilkinson

> All the LoRA tests have failed again

Looking into this now 👍

Update: I enabled the tests in `tests/lora/test_layers.py` for V1. The tests pass locally but OOM on the CI. I am tracking this down.

> It seems these modifications have significantly increased the time consumption for lora testing
> ![image](https://private-user-images.githubusercontent.com/19733142/418847014-be6080be-abc4-4c3e-b913-c9b1fa2de95f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEwOTM3NDYsIm5iZiI6MTc0MTA5MzQ0NiwicGF0aCI6Ii8xOTczMzE0Mi80MTg4NDcwMTQtYmU2MDgwYmUtYWJjNC00YzNlLWI5MTMtYzliMWZhMmRlOTVmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA0VDEzMDQwNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWViOTdkYTM5YTM4YmExY2VmMzQzMzQzNGQ0NTExY2I3NjQ3ZWEzYzgyNGJkZWE3NzhjZGQ3M2YzM2UxM2NhMzcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.q-ij9-uT8vQw3ksOoAXgHrJeEFGVbNKcFgNujeQEGY0)

Yes. This PR adds the v1_kernel tests in `test_punica_ops.py` and enables `test_layers.py` to run for...