linzi687
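text2sql task can't train. Claude's analysis of the cause:
Final conclusion
Problem location: the vLLM service layer is working normally; the problem is in the task scheduling layer:
1. ✅ Model is loaded (GPU 1: 29.4GB)
2. ✅ vLLM service is running normally (inference succeeds)
3. ✅ Tasks are enqueued (5 rollout tasks)
4. ✅ 10 AgentLoopWorker processes exist
5. ❌ TaskRunner cannot dispatch tasks to the Workers
Cause: the TaskRunner -> AgentLoopWorker task dispatch mechanism in the Agent-Lightning framework has a bug. Symptoms:...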
2025-11-13 11:58:20,317 [ERROR] (Process-1990274 agentlightning.execution.client_server) Runner 7 crashed; signaling stop event
Traceback (most recent call last):
  File "/root/miniconda3/envs/financial_text2sql/lib/python3.10/site-packages/agentlightning/execution/client_server.py", line 190, in _execute_runner
    await runner(client_store, worker_id, stop_evt)
  File "/root/miniconda3/envs/financial_text2sql/lib/python3.10/site-packages/agentlightning/trainer/trainer.py", line 539,...
Traceback (most recent call last):
  File "/root/miniconda3/envs/financial_text2sql/lib/python3.10/site-packages/agentlightning/execution/client_server.py", line 386, in execute
    asyncio.run(self._execute_algorithm(algorithm, store, stop_evt))
  File "/root/miniconda3/envs/financial_text2sql/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/root/miniconda3/envs/financial_text2sql/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()...
How can Hive databases be supported?