Luo Yu issues


Results: 3 issues by Luo Yu

[BesTLA] Support int5&int6 for kernels and models

## Type of Change

- Add new weight_dtype: int5 and int6
- Support model quantization of int5 and int6
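For readers unfamiliar with sub-byte weight types, here is a minimal sketch of what int5 and int6 quantization could look like. It assumes a symmetric, per-group scheme with one fp32 scale per group; the PR summary does not state this, and the names `symmetric_qmax`, `QuantizedGroup`, `quantize_symmetric`, and `dequantize` are hypothetical, not BesTLA APIs. Codes are stored one per `int8_t` for clarity, while a real kernel would pack the bits. Requires C++17 for `std::clamp`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest magnitude of a signed `bits`-wide code in a symmetric scheme:
// 15 for int5, 31 for int6.
inline int symmetric_qmax(int bits) { return (1 << (bits - 1)) - 1; }

// Hypothetical container: one fp32 scale plus unpacked integer codes.
struct QuantizedGroup {
    float scale;
    std::vector<int8_t> codes;
};

// Quantize one group of weights to signed `bits`-bit codes (assumed scheme).
inline QuantizedGroup quantize_symmetric(const std::vector<float>& w, int bits) {
    const int qmax = symmetric_qmax(bits);

    // The scale maps the largest absolute weight onto qmax.
    float absmax = 0.0f;
    for (float v : w) absmax = std::max(absmax, std::fabs(v));

    QuantizedGroup out;
    out.scale = absmax > 0.0f ? absmax / static_cast<float>(qmax) : 1.0f;
    out.codes.reserve(w.size());
    for (float v : w) {
        int q = static_cast<int>(std::lround(v / out.scale));
        q = std::clamp(q, -qmax, qmax);  // guard against rounding overshoot
        out.codes.push_back(static_cast<int8_t>(q));
    }
    return out;
}

// Recover an approximate fp32 weight from its code.
inline float dequantize(const QuantizedGroup& g, std::size_t i) {
    return g.scale * static_cast<float>(g.codes[i]);
}
```

Under these assumptions, int5 gives 31 usable levels and int6 gives 63, so the extra bits mainly reduce rounding error relative to int4.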

[SYCL]subgroup shuffle bug when sg_size=32

### Describe the bug

I'm debugging the SYCL backend of [llama.cpp](https://github.com/ggerganov/llama.cpp). I found that some kernels output `-nan` when built in Debug mode. The root cause is that

```cpp
sycl::select_from_group(g, x, target_offset...
```

bug

confirmed

sync SYCL code

## Type of Change

Update the SYCL performance.

```shell
llama2-7b int4, sym, g128, comp_dtype=fp32, scale_dtype=fp32, KV_dtype=fp32
Max1100: 8.6ms/token
A770: 14.5ms/token
A770m: 15.8ms/token
A750: 15.4ms/token
155H: 51.6ms/token
```

```shell
cmake .....
```