lingq1

Results 5 issues of lingq1

With the code below, I intended to get the result that no data is running high, but the qwen-72b model does not handle the recognition correctly: sometimes it identifies "江苏徐州八线8移动" and sometimes it identifies something else. Is there a better solution, or am I using the code incorrectly: ` import datetime import pprint import json5 from qwen_agent.agents import Assistant from qwen_agent.tools.base import BaseTool, register_tool @register_tool('origin_alarm_query_v2') class OriginAlarmGenerateV2(BaseTool): description = '提供公司内部告警的详细信息' parameters = [{ 'name': 'alarm_name', 'type':...

I used the script `tune run --nnodes 1 --nproc_per_node 2 knowledge_distillation_distributed --config llama3_2/8B_to_1B_KD_lora_distributed` to distill the model and saved the results to `/tmp/torchtune/llama3_2_8B_to_1B/KD_lora_distributed`. How can I run my 1B model...

discussion

Is pruning of large models similar to 70B supported?

Does this project support pruning of large 72B models?
