
Proposal for Enhancing Istio Community Engagement with AI-Integrated Q&A Bot

Open rootsongjc opened this issue 1 year ago • 10 comments

Here's the current process for creating a Q&A Discussion in the Istio community:

  1. Users submit a question here with a title and body.
  2. They await community responses, with the option to ask follow-up questions.
  3. Some users select an answer, but the question remains open.

This works well with a small volume of questions. However, as the number of questions grows, certain challenges become apparent:

  1. There's a lack of automation. Categorizing questions like GitHub issues would be beneficial.
  2. Currently, there are bots like istio-policy-bot, istio-release-robot, and istio-testing, but they already serve specific purposes. Perhaps a new bot, tentatively named istio-qa-bot, could be created.
  3. Users need to be encouraged to select the correct answer.
  4. AI could be used to reformat user questions for better readability and to provide answers.

I wrote up my proposal in this Google doc. Feel free to comment or ask me anything about it. I hope the Istio community will consider it.

rootsongjc avatar Jan 05 '24 07:01 rootsongjc

Thanks @rootsongjc for this initiative and proposal!

@howardjohn @linsun @craigbox @kfaseela @rcernich @justinpettit @ctrath @hzxuzhonghu please take a look at the proposal.

irisdingbj avatar Jan 05 '24 18:01 irisdingbj

Hi @rootsongjc, thanks for proposing this. I saw the cost and complexity concerns raised by other steering members, which make sense to me. Is it possible to start with a simple version where we simply alert users to check out a list of things before writing up the post in discuss?

linsun avatar Jan 11 '24 15:01 linsun

I want to adapt and utilize the existing bots. What is available now? I've heard that some bots are no longer going to be maintained.

rootsongjc avatar Jan 12 '24 07:01 rootsongjc

Any further thoughts, Jimmy? I think that having a model trained on Istio documentation attempting to manually answer some user queries would be a good first step. Do you have such a model?

craigbox avatar Mar 12 '24 00:03 craigbox

@craigbox I don't have a model available for open source yet, but we can train one. There's also the technology or platform to consider, and the cost involved in integrating the model and calling its APIs. I haven't figured out how to do all that yet.

rootsongjc avatar Mar 12 '24 02:03 rootsongjc

@rootsongjc is there still interest in doing this?

@craigbox has provided some feedback on my tool (https://devboard.gitsense.com/istio). I'm looking to convert all my data into embeddings for future AI features, and I would be interested in working with the Istio community to gather requirements. I'm currently capturing comments (up to 30,000 per repository due to GitHub limitations), issues, pull requests, commits, etc., so experimenting with different types of embedding models and chunking methods will be trivial.

My only concern right now is performance (not my indexing engine, but rather the process of generating embeddings). My indexing engine can scale horizontally and you can rent GPUs by the hour, so I'm hoping there will only be a one-time initial cost hit, and from then on commodity hardware can be used.
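To make "chunking methods" concrete, here is a minimal sketch of one common approach: fixed-size word windows with overlap. The function name `chunk_text` and the default sizes are assumptions for illustration only; they do not describe how the tool above actually works.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word windows of chunk_size words that overlap by `overlap` words.

    Hypothetical helper: the sizes are illustrative, not tuned values.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be passed to whichever embedding model is being evaluated; changing `chunk_size` and `overlap` is one cheap way to compare chunking strategies on the same data.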

terrchen avatar Apr 09 '24 19:04 terrchen

What Jimmy is proposing is effectively training a transformer model on the Istio and Envoy documentation, Q&A etc, and then using that to answer user questions in the first instance.

Some GitHub data might be useful in this, but I think that is more likely to trend towards developer questions/answers.
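As a rough illustration of "answering in the first instance", the sketch below matches a question to the closest documentation snippet. Note that it swaps in plain bag-of-words cosine similarity rather than a trained transformer, and the function names and sample snippets are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_match(question: str, docs: dict[str, str]) -> tuple[str, float]:
    """Return the name and score of the doc snippet most similar to the question."""
    if not docs:
        raise ValueError("docs must not be empty")
    q = Counter(tokenize(question))
    scored = [(name, cosine(q, Counter(tokenize(body)))) for name, body in docs.items()]
    return max(scored, key=lambda item: item[1])


# Hypothetical snippets standing in for real documentation pages.
docs = {
    "mtls": "Enable mutual TLS with a PeerAuthentication policy",
    "routing": "Route traffic with a VirtualService and DestinationRule",
}
name, score = best_match("How do I enable mutual TLS?", docs)
```

A real system would replace the token counts with model embeddings, but the shape is the same: score the question against each source and surface the best match as a first response.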

craigbox avatar Apr 12 '24 02:04 craigbox

I'm actually interested in generating embeddings that can support multiple personas (customers, developers, managers, executives, etc.), so what Jimmy is proposing does interest me, but I can see how including more development-related data can increase complexity.

I'm not sure what stage things are at, but I am very interested in learning more about your findings and knowing what sources (documents, questions, etc.) you are planning on training on.

terrchen avatar Apr 12 '24 14:04 terrchen

@terrchen I haven't started yet. Your tool is very useful for showing the contribution data, but do you have a bot that answers questions in GitHub issues or discussions?

rootsongjc avatar Apr 15 '24 03:04 rootsongjc

@rootsongjc Not yet. The goal is to get to a point where I can create agents/bots to answer questions and to perform tasks for maintainers, developers, team leads, etc.

The conclusion that I've come to is, in order to create a useful automated Q and A system, you'll need an easy way to do the following:

  1. Classify and clean data.
  2. Generate Q and A pairs.
  3. Review and iterate on generated Q and A pairs.

The part that I will be tackling in the near future is creating a system to classify and clean data, which is the most critical step, since garbage in = garbage out. Once you have clean data (with good metadata), generating Q and A pairs should be pretty straightforward, as you can use LLMs to generate them.
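The three steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the `QAPair` record, the keyword map, and the review statuses are hypothetical, and the pair is built from an existing question and answer rather than generated by an LLM.

```python
import re
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str
    category: str
    status: str = "pending_review"  # every new pair waits for a human review


# Hypothetical keyword map; a real system might use an LLM or a trained classifier.
CATEGORY_KEYWORDS = {
    "security": ("mtls", "authorizationpolicy", "certificate"),
    "traffic": ("virtualservice", "gateway", "destinationrule"),
    "install": ("istioctl", "helm", "upgrade"),
}


def clean(text: str) -> str:
    """Drop quoted reply lines and HTML tags, then collapse whitespace."""
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    no_tags = re.sub(r"<[^>]+>", " ", " ".join(lines))
    return re.sub(r"\s+", " ", no_tags).strip()


def classify(text: str) -> str:
    """Return the first category whose keywords appear in the text."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "uncategorized"


def build_pair(question: str, answer: str) -> QAPair:
    """Steps 1 and 2: clean both sides, classify, and emit a pair for review."""
    q, a = clean(question), clean(answer)
    return QAPair(question=q, answer=a, category=classify(q + " " + a))


def review(pair: QAPair, approved: bool) -> QAPair:
    """Step 3: record a reviewer's decision on the pair."""
    pair.status = "approved" if approved else "rejected"
    return pair
```

Keeping the review status on the record itself means rejected pairs can be fed back into the cleaning and classification rules on the next iteration.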

If you have a system or thoughts on how to clean/organize the data, I'd be interested to hear them. I'm currently planning on creating a system that will leverage an LLM to help classify and prep data.

terrchen avatar Apr 15 '24 15:04 terrchen

@rootsongjc hey Jimmy, in the context of the advance of LLMs and of things like https://tetrate.io/blog/introducing-istio-advisor-plus-gpt/, do you still think this is worth pursuing?

craigbox avatar May 05 '25 02:05 craigbox

(You might also like to check your current advisor, as I think it's a little out of date)

Image

(The current ChatGPT does a bit better, though doesn't get it entirely right)

craigbox avatar May 05 '25 02:05 craigbox

@craigbox Hey Craig, thanks for checking in.

Given the progress with tools like Istio Advisor GPT, and the direction things are heading with LLMs, I don't think pursuing a separate Istio Bot still makes sense. I’m also short on time and won’t be able to contribute to it going forward. I’ll try to find some time to update Istio Advisor GPT.

rootsongjc avatar May 05 '25 03:05 rootsongjc