punica issues

fix(bgmv): write shared_memory y_warpsize only when threadIdx.x == 0

1

The write to y_warpsize should be guarded with `threadIdx.x == 0`. Otherwise it leads to wrong answers.

menggeliu1205
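The effect of the missing guard can be sketched with a small pure-Python simulation. This is only an illustration under stated assumptions: it assumes the kernel reduces each warp with a shuffle-down style tree reduction (the issue summary does not say so), and `warp_reduce_shfl_down` and `write_shared` are hypothetical helper names, not functions from punica. After such a reduction only lane 0 holds the full sum, so an unguarded write lets any other lane overwrite the shared slot with a partial value.

```python
WARP_SIZE = 32  # lanes per warp on current NVIDIA GPUs


def warp_reduce_shfl_down(lane_values):
    """Simulate a shuffle-down tree reduction over one warp.

    Each step, every lane adds the value held by lane + offset.
    Lanes whose source is out of range read their own value back,
    so after the loop only lane 0 is guaranteed to hold the full sum.
    """
    vals = list(lane_values)
    offset = WARP_SIZE // 2
    while offset > 0:
        # All lanes read the pre-step values, as in a real warp shuffle.
        vals = [
            vals[lane]
            + (vals[lane + offset] if lane + offset < WARP_SIZE else vals[lane])
            for lane in range(WARP_SIZE)
        ]
        offset //= 2
    return vals


def write_shared(per_lane, guarded):
    """Model every lane storing its value into one shared-memory slot.

    Hypothetical stand-in for the shared-memory write in the issue.
    With guarded=True only lane 0 (threadIdx.x == 0) writes. Without the
    guard, lanes are replayed in order, so the last lane's value survives;
    on real hardware the surviving writer is unspecified.
    """
    shared_slot = None
    for lane, value in enumerate(per_lane):
        if guarded and lane != 0:
            continue
        shared_slot = value
    return shared_slot


lane_inputs = list(range(1, WARP_SIZE + 1))
reduced = warp_reduce_shfl_down(lane_inputs)
guarded_result = write_shared(reduced, guarded=True)
unguarded_result = write_shared(reduced, guarded=False)
print(guarded_result, unguarded_result, sum(lane_inputs))
```

In this model the guarded write matches the true sum of the inputs, while the unguarded write does not.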

[Feature Request] Add support for SM75

4

Any plans to add support for SM75 like V100 GPUs? Thank you!

sleepwalker2017

RunTimeError: output must be a cuda tensor

3

Hi! I tried using the benchmark text generation `python -m benchmarks.bench_textgen_lora --system punica --batch-size 32` but when I did I got a runtime error stating the output should be a...

iskander-sauma-assessio

[Question] How to avoid matrices conflict

> Assuming W of shape [H1, H2] is the weight of the pretrained model, LoRA adds two small matrices A of shape [H1, r] and B of [r, H2]. Running...

gyliu513
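The shapes in the quoted passage can be checked with a short pure-Python sketch. It illustrates only the quoted shapes, not the (truncated) question about conflicts; the sizes `H1`, `H2`, `r`, the integer values, and the helpers `matmul` and `add` are made up for the example. It shows that applying the pretrained weight and the low-rank pair separately gives the same result as merging `A @ B` into `W` first.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Illustrative sizes only: r is much smaller than H1 and H2 in practice.
H1, H2, r = 4, 3, 2

W = [[i + j for j in range(H2)] for i in range(H1)]      # pretrained weight, [H1, H2]
A = [[i - k for k in range(r)] for i in range(H1)]       # LoRA matrix A, [H1, r]
B = [[2 * k + j for j in range(H2)] for k in range(r)]   # LoRA matrix B, [r, H2]
x = [[1, 2, 3, 4]]                                       # one input row, [1, H1]

delta = matmul(A, B)                                     # low-rank update, [H1, H2]

# Merged form: fold the update into the weight, then apply it.
merged = matmul(x, add(W, delta))

# Unmerged form: keep W untouched and add the low-rank path on the side.
unmerged = add(matmul(x, W), matmul(matmul(x, A), B))

print(merged, unmerged)
```

Keeping the unmerged form leaves `W` unmodified, so different `A`/`B` pairs can be applied to different inputs against the same pretrained weight.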

Support for H100 GPUs?

1

When I set `TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9 9.0"`, I got compilation errors. Then I found: https://github.com/punica-ai/punica/blob/591b59899f0a20760821785d06b331c8a2e5cb86/.github/workflows/release_wheel.yml#L15 Is there something that is not supported yet? Thank you in advance! Update: Adding...

LorrinWWW

Support qwen?

1

Thanks!

cgq0816

Add support for running on Colab

4

I'm not able to install this library on Colab. I tried this:

```bash
git clone https://github.com/punica-ai/punica
cd punica && pip install .
```

But this is failing with the following...

dzlab

Multi GPU and Multi Node solution

4

I would like to know how to use multi-GPU and multi-node setups with the current Punica code. I would also like to know about the runner and scheduler code which is mentioned in...

luciferlinx101

Inquiry on cuda memory across processes

Hi, Congratulations on the great work you have done! I am very interested in your work. Specifically, I want to know how you allow multiple serving processes to share the...

mozizhao

chore(master): release 1.1.1

:robot: I have created a release *beep* *boop*

---

## [1.1.1](https://github.com/punica-ai/punica/compare/v1.1.0...v1.1.1) (2024-01-09)

### Bug Fixes

* **sgmv:** deadlock in sgmv_shrink kernel caused by skewed segments ([#35](https://github.com/punica-ai/punica/issues/35)) ([591b598](https://github.com/punica-ai/punica/commit/591b59899f0a20760821785d06b331c8a2e5cb86))

---

This PR...

github-actions[bot]

autorelease: pending
