DB-GPT

[BUG]: About the prompt content used by the SQL generation feature in direct-execution mode

Open xiangh8 opened this issue 1 year ago • 1 comments

Regarding the prompt: I don't understand why it doesn't end with something like "ASSISTANT:". At the moment, the prompts I see usually end with ###, for example:

You are an AI designed to answer human questions, please follow the prompts and conventions of the system's input for your answers###human:查询aaa数据库中的所有信息###system:\nYou are a SQL expert. Given an input question, first create a syntactically correct mysql query to run, then look at the results of the query and return the answer.\nUnless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most 5 results. \nYou can order the results by a relevant column to return the most interesting examples in the database.\nNever query for all the columns from a specific table, only ask for a the few relevant columns given the question.\nPay attention to use only the column names that you can see in the schema description. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.\n\nOnly use the following tables:\n[('user(id,name)',)]\n\nQuestion: 查询aaa数据库中的所有信息\n\nYou must respond in JSON format as following format:\n"{\n \"thoughts\": {\n \"reasoning\": \"reasoning\",\n \"speak\": \"thoughts summary to say to user\"\n },\n \"sql\": \"SQL Query to run\"\n}"\n\nEnsure the response is correct json and can be parsed by Python json.loads\n###
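To illustrate the point about the trailing role cue, here is a minimal sketch of a prompt builder that leaves the assistant turn open instead of ending on a bare separator. `build_prompt`, its parameters, and the USER/ASSISTANT role names are assumptions made for illustration; this is not DB-GPT's actual conversation code.

```python
# Hypothetical helper for illustration only; not taken from the DB-GPT code base.
SEP = "###"


def build_prompt(system: str, turns: list[tuple[str, str]],
                 assistant_role: str = "ASSISTANT") -> str:
    """Join the system text and (role, message) turns with SEP, then leave
    the assistant turn open so the model has a cue to start answering."""
    parts = [system]
    parts += [f"{role}: {message}" for role, message in turns]
    # The prompt now ends with "###ASSISTANT:" rather than a bare "###".
    return SEP.join(parts) + SEP + f"{assistant_role}:"


prompt = build_prompt(
    "You are an AI designed to answer human questions.",
    [("USER", "List the id and name columns of the user table.")],
)
print(prompt)
```

Whether the cue should be "ASSISTANT:" or something else depends on the conversation template the model was trained with, so the role name is kept as a parameter here.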

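Ending the prompt with a bare ### can sometimes cause the model to return without generating anything at all. (Would it be better to set the Vicuna roles to USER and ASSISTANT?) Separately, could the response format avoid nested JSON? The current response format is: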

#pilot/scene/chat_db/prompt.py
RESPONSE_FORMAT = {
    "thoughts": {
        "reasoning": "reasoning",
        "speak": "thoughts summary to say to user",
    },
    "sql": "SQL Query to run",
}

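With the nested JSON format, even when the model generates correct SQL, the query often cannot be executed because the JSON itself is malformed (╯﹏╰). Could this be changed to a single-level JSON object that is then assembled in the webserver? My current workaround is to change the prompt format to: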

#pilot/scene/chat_db/prompt.py
RESPONSE_FORMAT = {
    "reasoning": "reasoning",
    "speak": "thoughts summary to say to user",
    "sql": "SQL Query to run",
}

and to assemble the nested structure in pilot/scene/chat_db/out_parser.py:

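  # Drop any text the model emitted before the first '{' ...
  cindex = cleaned_output.find('{')
  cleaned_output = cleaned_output[cindex:] if cindex != -1 else cleaned_output
  # ... and after the last '}' (rindex raises ValueError if there is no '}').
  cleaned_output = cleaned_output[:cleaned_output.rindex('}') + 1]
  response = json.loads(cleaned_output)
  # Rebuild the nested "thoughts" object from the flat reply.
  mythoughts = {
      "reasoning": response["reasoning"],
      "speak": response["speak"],
  }
  json_data = json.dumps(mythoughts, ensure_ascii=False)
  json_object = json.loads(json_data)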
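The same idea can be written as a self-contained function, which makes it easier to try outside the parser. This is only a sketch under assumptions: `parse_flat_response` is a hypothetical name, the sample reply is invented, and defaulting missing "reasoning"/"speak" fields to an empty string is a design choice of this sketch rather than the behavior of out_parser.py.

```python
import json


def parse_flat_response(raw: str) -> dict:
    """Take the text from the first '{' to the last '}' in the model output,
    parse it as flat JSON, and rebuild the nested thoughts/sql structure."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    flat = json.loads(raw[start:end + 1])
    return {
        "thoughts": {
            # Missing explanation fields should not block SQL execution.
            "reasoning": flat.get("reasoning", ""),
            "speak": flat.get("speak", ""),
        },
        "sql": flat["sql"],  # the SQL itself is required
    }


# Invented sample reply with chatter around the JSON object.
raw = ('Sure! {"reasoning": "need ids", "speak": "Here are the users", '
       '"sql": "SELECT id, name FROM user LIMIT 5"} Done.')
result = parse_flat_response(raw)
print(result["sql"])
```

Only "sql" is treated as mandatory, so a reply that omits the explanation fields can still be executed.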

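Of course, this only addresses the SQL-execution option, not knowledge-base Q&A, so I'm not sure whether the current prompt setup serves some other purpose. (I have spent today looking only at this feature.) For SQL generation in direct-execution mode alone, though, this approach seems more stable.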

xiangh8, May 26 '23 14:05

Good suggestion. Feel free to open a PR directly 😁

csunny, May 26 '23 16:05

Hey, how can I use the direct-execution mode?

WindFlowUpTheMoon, May 31 '23 07:05

This feature has already been released. You can pull the latest code and run it in SQL auto-execution mode. @WindFlowUpTheMoon

You may also want to take a look at issue #96.

csunny, Jun 02 '23 08:06