chenyangjun
I think adding a Django option could work for implementing this.
Follow-up question: doc_tokens is the question followed by the text of all related passages under that question, right?
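If that reading is right, a tiny sketch of the assembly might look like the following (a hypothetical helper; the real field names and tokenization come from the project's preprocessing code):

```
# Hypothetical illustration of the reading above: doc_tokens as the question
# text followed by the text of every related passage under that question.
def build_doc_tokens(question: str, related_passages: list[str]) -> list[str]:
    text = question + "".join(related_passages)
    return list(text)  # character-level tokens, a common choice for Chinese text
```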
When I use vllm to load a ChatGLM2 model trained with "quantization_bit 8", it does not seem to be supported. The code below is the original code in ChatGLM2's "modeling_chatglm.py". If...
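One cheap pre-flight check (a sketch, not a confirmed fix from this thread) is to read the checkpoint's config.json before handing the path to vllm, since ChatGLM2 records quantization_bit there:

```
# Sketch of a pre-flight check: ChatGLM2 checkpoints record quantization_bit
# in config.json, and weights saved with quantization_bit=8 are int8 tensors
# that vanilla vllm cannot load. The checkpoint path is hypothetical.
import json, os

ckpt = "path/to/chatglm2-checkpoint"  # hypothetical path
with open(os.path.join(ckpt, "config.json")) as f:
    cfg = json.load(f)

if cfg.get("quantization_bit", 0):
    raise SystemExit(
        f"quantization_bit={cfg['quantization_bit']}: export an fp16 copy "
        "before serving with vllm"
    )
```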
> Try mixing some general-purpose Q&A data or instruction data into the fine-tuning set.

General-purpose corpora are so large and so varied in kind that this isn't realistic to pull off.
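One possible middle ground (my own assumption, not something the thread settled on): subsample the general data rather than mixing in the whole corpus, keeping it roughly on the same scale as the task data:

```
# Sketch of subsampling general instruction data instead of mixing in the
# full corpus. Both file names are hypothetical; the format is assumed to
# be one JSON example per line.
import json, random

random.seed(0)
with open("general_instructions.jsonl") as f:  # hypothetical file
    general = [json.loads(line) for line in f]
with open("task_finetune.jsonl") as f:         # hypothetical file
    task = [json.loads(line) for line in f]

# Keep roughly one general example per task example.
mixed = task + random.sample(general, k=min(len(task), len(general)))
random.shuffle(mixed)

with open("mixed_finetune.jsonl", "w") as f:
    for ex in mixed:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```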
```
class ChatglmModel(BaseModel):
    def process_response(self, response):
        # Strip surrounding whitespace and substitute the training-time
        # placeholder with a concrete value.
        response = response.strip()
        response = response.replace("[[训练时间]]", "2023年")
        return response

    def is_stop(self, token_id):
        return token_id
```
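For example, with `BaseModel` available from the surrounding codebase, the response cleanup behaves like this:

```
# Usage illustration of process_response (assumes BaseModel is importable
# from the surrounding codebase, since it is not part of the snippet above).
m = ChatglmModel()
print(m.process_response("  我是在[[训练时间]]训练的。 "))
# -> 我是在2023年训练的。
```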
> **Conversion code:**
>
> ```
> import time, torch, os
> from transformers import AutoModel, AutoTokenizer
> from fastllm_pytools import llm
>
> model_path = "chatglm3-5e-4-30-2_1128_export"
> tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
> model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
> model = ...
> ```
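For context, the documented fastllm_pytools flow typically continues from this point as below; since the quoted snippet is truncated, this is the library's standard pattern, not necessarily the poster's exact code:

```
# Standard fastllm_pytools continuation (sketch based on the library's
# documented usage; the quoted snippet above is truncated, so this may
# differ from the original code).
model = llm.from_hf(model, tokenizer, dtype="float16")  # convert HF weights to fastllm
model.save("chatglm3.flm")                              # export a .flm model file
```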