Hongyi Jin issues

Results: 3 issues by Hongyi Jin

Change vocab size to support vicuna v1

Vicuna v0's vocab_size is 32001, but v1's vocab_size is 32000, so we need to update the manual schedule.

Enable weight compression in GPU

This PR enables weight compression on the GPU. Previously, weight compression ran on the CPU because the uncompressed weights were too large to fit on the GPU, and running on CPU...

[Dlight] Scheduling Low batch GEMM using GEMV-like rule

1. Add a dlight rule LowBatchGEMV to schedule low-batch GEMM just like GEMV.
2. Fix some issues when lowering low-batch GEMM.