zyf-gh
I found that in `modeling_phonelm_npu`, `shadowLayers` is `{0, 1, 3, 4}`, while in `modeling_qwen_npu`, `shadowLayers` is `{1, 2, 26}`. How are the `shadowLayers` determined?
I found that the quantization algorithm only supports a fixed set of models. If I want to perform int8 quantization on my own custom model, how can I do it?
When I call `outputs.printData()` or `outputs.saveData()` on the outputs in the `Forward` function of `PhoneLMForCausalLM` in `src/models/phonelm/modeling_phonelm.hpp`, a segmentation fault occurs. How can I get the data of the outputs?
I would like to ask whether the KV cache generated in the prefill stage can be used as PyTorch's KV cache, so that PyTorch can perform the subsequent decoding work on another...
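For concreteness, here is a minimal sketch of what I have in mind, assuming the prefill K/V tensors could somehow be dumped per layer (the dump file names, the `(batch, num_heads, seq_len, head_dim)` layout, and the specific checkpoint are my assumptions, not anything this repo provides); it only uses the standard Hugging Face `past_key_values` interface on the PyTorch side:

```python
# Sketch only: continue decoding in a Hugging Face PyTorch model from an
# externally produced prefill KV cache.
# Assumptions (not from this repo): K/V were dumped per layer as .npy files
# with shape (batch, num_heads, seq_len, head_dim), matching the HF layout.
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

model_id = "Qwen/Qwen1.5-0.5B"  # hypothetical; use the checkpoint matching the prefill model
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
tok = AutoTokenizer.from_pretrained(model_id)

# Rebuild the legacy (key, value)-per-layer tuple from the dumped arrays.
legacy_cache = []
for i in range(model.config.num_hidden_layers):
    k = torch.from_numpy(np.load(f"prefill_k_layer{i}.npy")).to(model.dtype)  # hypothetical dump file
    v = torch.from_numpy(np.load(f"prefill_v_layer{i}.npy")).to(model.dtype)  # hypothetical dump file
    legacy_cache.append((k, v))
past = DynamicCache.from_legacy_cache(tuple(legacy_cache))

# Decode one step: feed only the last prompt token; the rest comes from the cache.
prompt = "..."  # the original prompt used during prefill
last_token_id = tok(prompt, return_tensors="pt").input_ids[:, -1:]
with torch.no_grad():
    out = model(input_ids=last_token_id, past_key_values=past, use_cache=True)
next_id = out.logits[:, -1, :].argmax(dim=-1)
print(tok.decode(next_id))
```

The main thing I am unsure about is whether the cache layout produced during prefill here matches the layout PyTorch expects, or whether a transpose/reshape would be needed in between.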