tianmala

Results: 10 issues by tianmala

```
Traceback (most recent call last):
  File "scripts/generate_chatllama.py", line 82
    args.tokenizer = str2tokenizer[args.tokenizer](args)
  File "/home/mo/llama/TencentPretrain/tencentpretrain/utils/tokenizers.py", line 255, in __init__
    super().__init__(args, is_src)
  File "/home/mo/llama/TencentPretrain/tencentpretrain/utils/tokenizers.py", line 30, in __init__
    self.sp_model.Load(spm_model_path)
  File "/home/mo/miniconda3/envs/llm_env/lib/python3.8/site-packages/sentencepiece/__init__.py", ...
```
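The failure happens inside sentencepiece's `Load` call, which typically raises when the given model file is missing or is not a valid SentencePiece `.model` file. A minimal sketch of a pre-flight check, assuming a hypothetical tokenizer path (not taken from the original issue):

```python
# Hypothetical pre-flight check before handing the path to TencentPretrain;
# the spm_model_path value below is an assumption for illustration only.
import os
import sentencepiece as spm

spm_model_path = "/home/mo/llama/tokenizer.model"  # assumed location of the LLaMA tokenizer

if not os.path.isfile(spm_model_path):
    raise FileNotFoundError(f"SentencePiece model not found: {spm_model_path}")

sp = spm.SentencePieceProcessor()
sp.Load(spm_model_path)  # raises RuntimeError if the file is not a valid SentencePiece model
print("vocab size:", sp.GetPieceSize())
```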

Could you tell me whether FasterTransformer supports the ChatGLM or GLM models? If not, do you have plans to support the GLM model?

Thanks for this repo. Excellent recognition results! Looking forward to the open-source code for Android platform deployment.

enhancement

I can build this project with Gradle successfully, but when I run the app, it crashes. The logcat is:
```
--------- beginning of main
--------- beginning of kernel
--------- beginning...
```

I ran the commands from readme.md for Android Step 2, `./gradlew build`, but I can't find the directory `app/.cxx/cmake/release` mentioned in Step 3. I don't know how to solve it...
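The generated CMake output path can vary with the Android Gradle Plugin version and build variant (for example `app/.cxx/Debug/<hash>/...` instead of `app/.cxx/cmake/release`). A small sketch, assuming you run it from the repository root, that lists whichever native build directories Gradle actually produced:

```python
# Hypothetical helper: list the native (CMake) build output directories Gradle generated,
# since the exact layout under app/.cxx depends on the AGP version and build variant.
from pathlib import Path

project_root = Path(".")            # assumed: run from the repository root
cxx_root = project_root / "app" / ".cxx"

if not cxx_root.exists():
    print("No app/.cxx directory found; run ./gradlew build (or assembleRelease) first.")
else:
    for cmake_cache in cxx_root.rglob("CMakeCache.txt"):
        print("native build dir:", cmake_cache.parent)
```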

```
Traceback (most recent call last):
  File "/home/mo/chatglm/transformers_tasks/LLM/finetune/train.py", line 352
    351   if __name__ == "__main__":
  ❱ 352   ...
```

When using ChatGLM, compared with single-GPU inference, what is the typical response speed of multi-GPU inference? And roughly how does the Int4 inference speed compare with FP16? Thanks.

![image](https://user-images.githubusercontent.com/19283764/231936409-dade4678-2c82-4925-9d43-c380988d5095.png)

This method seems to behave somewhat like prompting: the output is not very stable, and it does not always produce the correct answer. I ran the bundled example and found the results very unstable.
![image](https://github.com/hiyouga/FastEdit/assets/19283764/2855299e-06d5-42bf-9245-8465a9b06fb7)
![image](https://github.com/hiyouga/FastEdit/assets/19283764/988f83f5-b83c-45a2-88b3-ac7a06c87268)

Recently the MNN master branch received a major update, with significant improvements in model memory usage and speed. Does the current mnn-llm support the latest master branch of MNN?