南栖
I accidentally deleted tokenizer.model after running download.sh. When I tried to download it again, the request was already returning 403 Forbidden, so it could not be downloaded (maybe the download link...
CUDA_VISIBLE_DEVICES=0 python llama_inference.py decapoda-research/llama-7b-hf --wbits 4 --load llama7b-4bit.pt --text "this is llama"
Loading model ... Done.
Traceback (most recent call last):
  File "llama_inference.py", line 115, in <module>
    generated_ids = model.generate(
  File...
Step 1: merge the LoRA weights into the base model, then run inference on the merged model. After merging, you can also quantize the merged model with bitsandbytes before inference, which significantly reduces GPU memory usage; I didn't test the quality at first, but when I later tried it, the quantization seemed to make the LoRA ineffective, so I think bitsandbytes quantization should be applied at training time instead.
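A minimal sketch of the merge step, assuming a PEFT-style LoRA adapter and Hugging Face transformers; the base model name and the adapter/output paths are placeholders, not the project's actual checkpoints:

```python
# Sketch: fold a LoRA adapter into its base model, save it, then run plain inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "decapoda-research/llama-7b-hf"  # assumption: base model used for training
lora_path = "./lora-adapter"                        # assumption: directory with the LoRA weights
merged_path = "./llama-7b-merged"                   # where to write the merged model

# Load the base model and attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, lora_path)

# merge_and_unload() bakes the LoRA deltas into the base weights and drops the adapter,
# so the result behaves like an ordinary Hugging Face model.
model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
model.save_pretrained(merged_path)
tokenizer.save_pretrained(merged_path)

# Inference on the merged model; peft is no longer needed at this point.
inputs = tokenizer("this is llama", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

If you quantize after this merge (e.g. loading the merged weights with bitsandbytes), the comment above suggests the LoRA contribution may effectively be lost, which is why applying quantization during training is the safer option.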
A question about the quality of the results
Why can vicuna13b reach 90% of ChatGPT with only 70k instruction examples, while this project has used over a million? In principle, a large model's language transfer ability should be strong; or is Vicuna's evaluation simply not comprehensive enough?
Here are some of the things I've tried, but it still doesn't work:
https://github.com/Cornell-RelaxML/quip-sharp/issues/15
https://github.com/Cornell-RelaxML/quip-sharp/issues/30
https://github.com/Minami-su/quip-sharp-qwen
QuIP# method, a weights-only quantization method that is able to achieve near fp16 performance...
```
python build.py --hf_model_dir Qwen-7B-Chat \
    --quant_ckpt_path ./qwen_7b_4bit_gs128_awq.pt \
    --dtype float16 \
    --remove_input_padding \
    --use_gpt_attention_plugin float16 \
    --enable_context_fmha \
    --use_gemm_plugin float16 \
    --use_weight_only...
```
Thanks for creating this repo. I made a few changes to my GitHub profile README. Have a look: https://github.com/Minami-su I would be happy if my profile gets added.