
Serving multiple LoRA finetuned LLMs as one

18 punica issues, sorted by recently updated

Everything goes well when I install punica from the binary package. However, I get "ImportError: cannot import name 'BatchedKvCache' from 'punica'" when I run `python -m benchmarks.bench_textgen_lora --system punica --batch-size 32`....
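
One way to narrow this down (a diagnostic sketch, not the maintainers' prescribed fix) is to check which `punica` package Python actually resolves and what it exports, since a local source checkout on the path can shadow the installed binary wheel:

```python
# Sketch: confirm which punica installation gets imported and whether it
# exposes BatchedKvCache; a local source tree can shadow the binary package.
import importlib
import importlib.util

spec = importlib.util.find_spec("punica")
print("punica resolves to:", spec.origin)

punica = importlib.import_module("punica")
print("exported names:", sorted(n for n in dir(punica) if not n.startswith("_")))
```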

I'm running the following code and found that the output is wrong. I initialize `x` and `w` to be all ones, so every output `y` value should be `h1=4096`. But...
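
For reference, a minimal plain-PyTorch check of the expected value (assuming `h1 = h2 = 4096` and all-ones `x` and `w`, as described above):

```python
# Reference check (assumption: x has h1 = 4096 features, w maps h1 -> h2,
# both filled with ones), so every output element should equal h1 = 4096.
import torch

h1, h2 = 4096, 4096
x = torch.ones(1, h1, dtype=torch.float32)
w = torch.ones(h1, h2, dtype=torch.float32)
y = x @ w
assert torch.all(y == h1), y
print(y[0, :4])  # tensor([4096., 4096., 4096., 4096.])
```

Comparing against a float32 reference like this also helps rule out fp16 accumulation artifacts as the source of a mismatch.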

Thank you for your great work! May I ask about some details on the scheduler? 1. In the paper, it is mentioned that "To minimize latency penalty, we limit the prefill...

Hello! Thank you for this awesome work. I am testing `Punica` for serving my custom models, which use GPT-NEOX as the base model. Currently, does `Punica` support other...

Hey folks, awesome and really impactful work with the repo and the paper. I was wondering what the reason was for switching from the original `bgmv` kernel to a CUTLASS-based...
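
For context, the `bgmv` (batched gather matrix-vector) semantics the question refers to can be written as a plain-PyTorch loop; this is only an illustrative reference of the operation, not the repo's kernel:

```python
import torch

def bgmv_reference(y: torch.Tensor, x: torch.Tensor,
                   w_all: torch.Tensor, indices: torch.Tensor,
                   scale: float = 1.0) -> None:
    """Reference semantics: y[i] += scale * x[i] @ w_all[indices[i]].
    Each request i in the batch gathers its own LoRA weight slice."""
    for i in range(x.size(0)):
        y[i] += scale * (x[i] @ w_all[indices[i]])
```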

My environment is:

```
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Arch Linux (x86_64)
GCC version: (conda-forge...
```

Hello, I really want to try the custom expand kernel instead of the CUTLASS version. Has that kernel been pushed out yet?

Hi, I would like to know whether Punica can support different LoRA adapters, such as LoRA adapters of different ranks or on different devices. Thanks, and I hope for your response.
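
As a point of reference only (not Punica's implementation), heterogeneous adapters can always be served with a naive per-adapter fallback like the sketch below, which is the baseline a batched kernel would need to beat; all names here are hypothetical:

```python
import torch

def apply_lora_naive(x, lora_a, lora_b, adapter_ids, scale=1.0):
    """Naive reference: group requests by adapter and apply each adapter's
    low-rank update y = scale * (x @ A) @ B. Ranks may differ per adapter
    because each (A, B) pair is used independently.

    x:            (batch, h1)
    lora_a[k]:    (h1, r_k)   -- rank r_k may differ per adapter
    lora_b[k]:    (r_k, h2)
    adapter_ids:  (batch,) long tensor of adapter indices
    """
    h2 = lora_b[0].size(1)
    y = x.new_zeros(x.size(0), h2)
    for k in torch.unique(adapter_ids):
        mask = adapter_ids == k
        a, b = lora_a[int(k)], lora_b[int(k)]
        y[mask] = scale * (x[mask] @ a) @ b
    return y
```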