Debug TiDB Cloud Documentation: Explore Your Data with AI-Powered Chat2Query (beta)
This issue is a sub-issue of Debug TiDB Cloud Documentation: Summary Issue · Issue #15480 · pingcap/docs. The purpose of this sub-issue is to verify and debug the Explore Your Data with AI-Powered Chat2Query (beta) document.
You can follow the instructions provided in #15480 to verify and debug the instructions in this document.
- After finishing your verification, please add your verification result to this sub-issue as a comment. The result can be the issues you encounter, the mistakes you find, or any other findings. If everything looks fine, you can note that in a comment as well.
- For any issues you find during the verification, you are welcome to create a pull request (PR) to fix them directly. In the PR description, please indicate which issue the PR resolves (for example, fix #15742). To learn how to create a pull request, see TiDB Documentation Contributing Guide.
Note: Currently, the TiDB Cloud documentation is in English only and it is stored in the release-7.5 branch of pingcap/docs for reusing the SQL documentation of TiDB. Hence, to create a pull request for TiDB Cloud documentation, make sure that your PR is based on the release-7.5 branch.
Your contribution to testing and verifying the documentation is highly appreciated!
/assign
The main issue with this document is that it does not provide any examples or detailed instructions on how to generate queries with AI. These are the only instructions:
If AI is enabled, simply type -- followed by your instructions to let AI generate SQL queries automatically or write SQL queries manually. For a SQL query generated by AI, you can accept it by pressing Tab and then further edit it if needed, or reject it by pressing Esc.
This is unhelpful. I tried winging it and typed this into Chat2Query:
-- show all tables in database fortune500
After typing it, I had to press Enter to get a response. The documentation should say so; otherwise the user might click the Run or Explain button instead.
The response I got was: Unable to generate SQL as no database is specified. Please identify database in statement "use database".
In fact, that was the same response to anything I typed after -- unless I first entered use <database>. If that's the case, then that should be a part of the instructions.
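The sequence described above can be sketched as follows. This is only an illustration of the order of steps, assuming the fortune500 database used in my test; the explanatory comments are mine, and the query the AI returns is not shown because it may vary.

```sql
-- Step 1: select a database first; without this, the AI replies
-- "Unable to generate SQL as no database is specified."
USE fortune500;

-- Step 2: type "--" followed by an instruction, then press Enter
-- (not Run or Explain) to get an AI-generated query.
-- show all tables in database fortune500
```

Once a query is suggested, it can be accepted with Tab or rejected with Esc, as the current instructions state.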
There are no examples of how to formulate an instruction to the AI. There should be a couple of examples to get people started, as well as to demonstrate the benefit. I tested generating simple SQL queries, for example:
-- show all companies with number of employees less than 10000 and based in the USA, no duplicates
SELECT DISTINCT
`company_name`
FROM
`fortune500_2018_2022`
WHERE
`employees_num` < 10000
AND `country` = 'USA';
and it seems to work great. It's a nice use of AI!
To sum up, my main suggestion for this page is to make it clearer how to use the chatbot by:
- Telling users to press Enter after entering their instructions to the AI
- Making it clear that they must specify a database before the AI will work
- Providing a couple of example instructions to the AI and the resulting generated queries
Minor items:
This Note near the top of the page:
Note Chat2Query is supported for TiDB clusters that are v6.5.0 or later and are hosted on AWS.
For TiDB Serverless clusters, Chat2Query is available by default. For TiDB Dedicated clusters, Chat2Query is only available upon request. To use Chat2Query on TiDB Dedicated clusters, contact TiDB Cloud support.
And this second bullet point in the Limitation section:
The Chat2Query API is available for TiDB Serverless clusters. To use the Chat2Query API on TiDB Dedicated clusters, contact TiDB Cloud support.
These two passages cover similar ground, though the Note is about Chat2Query in general and the Limitation bullet is about the Chat2Query API specifically.
I recommend consolidating the Note into the Limitation section by revising that section's second bullet as follows:
Chat2Query is only supported for TiDB clusters that are v6.5.0 or later and hosted on AWS. Chat2Query and the Chat2Query API are available by default for TiDB Serverless clusters. To use Chat2Query or the Chat2Query API on TiDB Dedicated clusters, contact TiDB Cloud support.
The section heading should also be Limitations, not Limitation. I would remove the Note and move the Limitations section into its place, above the Use Cases section.
Also in the Limitation section, this sentence is not well formed:
SQL queries generated by the AI are not 100% accurate and might still need your further tweak.
I recommend rewording it as:
SQL queries generated by the AI may not be 100% accurate, and you may need to refine them.
I will submit a PR for these items.
Hi @minaelee, thanks so much for your valuable feedback❤️. Indeed, adding detailed instructions and examples on generating queries with AI will help users get started quickly and will showcase the benefits of the AI. Let's leave this sub-issue open until we include this information in the document.
And I appreciate your insightful refinements to the notes and limitations sections in this document👍. I've added a few comments to the PR (https://github.com/pingcap/docs/pull/16093) you created. PTAL. Thanks.