jingnorth
assistant: Oops! Response Exception

I tried llama3-70b. Is the response above caused by insufficient resources? Here is the ollama log:

May 10 21:19:13 wae ollama[14972]: {"function":"process_single_task","level":"INFO","line":1507,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":97,"tid":"140514672760704","timestamp":1715347153}
May 10 21:19:13 wae ollama[14972]: {"function":"log_server_request","level":"INFO","line":2735,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"127.0.0.1","remote_port":57168,"status":200,"tid":"140473610786560","timestamp":1715347153}
May 10 21:19:13 wae ollama[14972]: {"function":"log_server_request","level":"INFO","line":2735,"method":"POST","msg":"request","params":{},"path":"/tokenize","remote_addr":"127.0.0.1","remote_port":57168,"status":200,"tid":"140473610786560","timestamp":1715347153}
May 10 21:19:13...
Earlier I tested the local knowledge base and the returned answers were not ideal. I wanted to take the RAG knowledge base out of the picture and instead paste some of the data directly into the context, to see whether the problem lies with the large model itself or with the search results that RAG feeds it. On the Llama Chinese community site, llama3-8b answered this test very well, accurate across the board. Back on our system, without the local knowledge base and using the same context, I tested llama3-8b-q4, llama2-13b-q4, and qwen-4b-q4, and the results were still poor. My question is: does 4-bit quantization really degrade the models that much? Or is the model that ollama loads somehow different? Does chat-ollama do any special processing when it calls the model? I'm quite puzzled.
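To take chat-ollama out of the loop entirely, one quick check is to send the exact same context and question straight to ollama's /api/chat endpoint for each quantized model and compare the answers with what the community site returned. Below is a minimal sketch, assuming ollama listens on localhost:11434; the model tags and the context/question placeholders are mine, not taken from the thread, so substitute the ones actually pulled locally.

```ts
// compare-models.ts — rough sketch, independent of chat-ollama.
// Requires Node 18+ (built-in fetch) and the models already pulled via `ollama pull`.
const OLLAMA_CHAT = "http://localhost:11434/api/chat";

const context = "...paste the same background text used in the knowledge-base test...";
const question = "...the question that produced poor answers...";

// Placeholder tags — replace with the exact quantized variants being tested.
const models = ["llama3:8b-instruct-q4_0", "llama2:13b-chat-q4_0", "qwen:4b"];

async function ask(model: string): Promise<string> {
  const res = await fetch(OLLAMA_CHAT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      stream: false, // return one JSON object instead of a token stream
      messages: [
        { role: "system", content: `Answer using only the context below.\n\n${context}` },
        { role: "user", content: question },
      ],
    }),
  });
  if (!res.ok) throw new Error(`${model}: HTTP ${res.status}`);
  const data = await res.json();
  return data.message.content;
}

for (const model of models) {
  console.log(`----- ${model} -----`);
  console.log(await ask(model));
}
```

If the answers are equally poor here, the gap is down to the quantized models (or prompt formatting), not to anything chat-ollama does; if they are good here but bad in chat-ollama, the problem is in how chat-ollama builds the request.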
I downloaded the 1.5b and 7b models and both fail when run, just like the larger-parameter models. It feels like the requests are timing out.
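If these really are timeouts while the model is still being loaded, one way to confirm is to warm the model up first and then give the actual chat request a generous deadline. A rough sketch, assuming Node 18+ fetch and AbortSignal.timeout; the model tag and the time limits below are placeholders, and it relies on ollama's documented behavior of loading a model into memory when /api/generate is called without a prompt:

```ts
// timeout-check.ts — see whether the error disappears once the model is already loaded.
const BASE = "http://localhost:11434";
const MODEL = "qwen2:1.5b"; // placeholder — substitute the tag that failed

// Step 1: preload the model. With no prompt, /api/generate just loads the model
// and returns once loading finishes, so give it plenty of time on first use.
await fetch(`${BASE}/api/generate`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: MODEL }),
  signal: AbortSignal.timeout(10 * 60_000), // up to 10 minutes for the initial load
});

// Step 2: ask the real question with its own, shorter deadline.
const res = await fetch(`${BASE}/api/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: MODEL,
    stream: false,
    messages: [{ role: "user", content: "hello" }],
  }),
  signal: AbortSignal.timeout(60_000),
});
console.log((await res.json()).message.content);
```

If step 2 succeeds after the warm-up, the failures in chat-ollama are most likely the client giving up while ollama is still loading the model, not a problem with the models themselves.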